Preface



This summary report comprises two chapters (Introduction 
and Recommendations) of the panel's full study together 
with a brief summary of the rest of its work. It 
summarizes what we have learned about the uses and abuses 
of polls and surveys, mentions some technical problems of 
measurement and error in surveys, offers observations on
the analysis of survey data, and reviews some promising 
ideas for improving survey methods. The basis for this 
summary report and for the subsequent recommendations is 
provided by two separately published volumes: the full 
study completed by the panel itself and a set of 
technical papers commissioned by the panel. The contents 
of those volumes are listed on pp. ix-x. 

This summary report is intended for nonspecialists. 
The panel has prepared it because surveys of subjective 
phenomena can affect public discourse and public decision 
making. Survey practitioners thereby incur an obligation 
to make the problems and progress of this enterprise 
known outside their profession. Since the panel exhorts 
others to meet this obligation (see Recommendation 2), we
proffer this document as our own contribution to improved 
public understanding. 

The panel was first convened in January 1980 and held 
five plenary meetings (and assorted working group 
sessions) in the ensuing 13 months. Prior to January 
1980, a small group had met to begin exploring some of 
the issues treated in this report. That meeting was 
hosted by the Institute for Research in the Social 
Sciences at the University of North Carolina and attended 
by James A. Davis, Frank Munger, Robert Parke, Mark 
Schulman, D. Garth Taylor, and several members of this 
panel. That meeting planted a seed that was nurtured 
through a long period of development and review. David 
Goslin, Sara Kiesler, Edwin Goldfield, and Margaret
Martin at the Assembly of Behavioral and Social Sciences 
and Robert Parke at the Social Science Research Council 
provided valuable assistance during that period. 

This study was funded by a grant from the National 
Science Foundation, and we are grateful for that support 
and for the advice given during the early stages of this 
work by Murray Aborn. 

The panel's work benefited from the individual 
contributions of the authors commissioned to undertake 
special studies presented in Volume 2 of Surveying 
Subjective Phenomena. The panel also profited from a 
discussion with Saunders Mac Lane and from critiques of 
its work by reviewers appointed by the Committee on 
National Statistics and the Assembly of Behavioral and 
Social Sciences of the National Research Council. At the
request of the editors, James A. Davis provided a
detailed critique that helped greatly in our revisions. 
Naomi D. Rothwell, Stanley Presser, and Miron Straf also 
provided helpful comments on the report. None of these 
reviewers, of course, is responsible for the conclusions 
of the panel's study, but we are grateful for their 
thoughtful assistance. We are also indebted to Chris 
McShane and Elaine McGarraugh for their patient help in 
preparing this manuscript. 

The panel's work was aided by the Gallup Organization, 
the Opinion Research Corporation, the Washington Post 
poll, the General Social Survey program of the National 
Opinion Research Center, and the Question Form, Wording 
and Context Project of the Survey Research Center at 
Michigan. These organizations used their own resources 
to conduct special experiments that assisted the panel in 
its work. 

Finally, the editors wish to acknowledge their debt to 
Eugenia Grohman, who helped them reorganize, revise, and 
survive this report. 

Charles F. Turner and Elizabeth Martin 



NOTE: In the text we use both masculine and feminine 
pronouns for indefinite references. 
Hasn't it often occurred that instruments
originally invented for record and computation 
have inadvertently so extended the concepts 
of the entity they were invented to measure 
(concepts of space, etc.) in the mind and 
imagination that employed them, that they may 
metaphorically be said to have extended the 
original boundaries of the entity measured? 

Hart Crane 



Introduction 



Our central concern in this study was the poll or sample 
survey as used both in the social and behavioral sciences 
and in journalism, marketing, and politics. We followed 
fairly closely the usage of a committee of the American 
Statistical Association that recently issued a useful 
pamphlet What Is a Survey? (Ferber et al., 1980): 

Any observation or investigation of the facts 
about a situation may be called a survey. 
But today the word is most often used to 
describe a method of gathering information 
from a number of individuals, a "sample," in 
order to learn something about the larger 
population from which the sample has been 
drawn. Thus, a sample of voters is surveyed 
in advance of an election to determine how 
the public perceives the candidates and the 
issues. A manufacturer makes a survey of the 
potential market before introducing a new 
product. A government agency commissions a 
survey to gather the factual information it 
needs in order to draft new legislation. For 
example, what medical care do people receive, 
and how is it paid for? Who uses food stamps? 
How many people are unemployed? [p. 3] 



NOTE: Throughout this summary report, references to 
Volume 1 of Surveying Subjective Phenomena are made 
by noting the relevant chapters; references to 
Volume 2 are made by noting the authors of the 
individual papers. 



Surveys of human populations also provide 
an important source of basic social science 
knowledge. Economists, psychologists,
political scientists and sociologists obtain 
foundation or government grants to study such 
matters as income and expenditure patterns 
among households, the roots of ethnic or 
racial prejudice, comparative voting 
behavior, or the effects of employment of 
women on family life. [p. 5] 

Among surveys, we were especially interested in those 
surveys, or portions of surveys, that deal with 
subjective phenomena. (For a review of surveys in 
general, see Tanur [1981].) 



OBJECTIVE AND SUBJECTIVE PHENOMENA 

Subjective phenomena are those that, in principle, can be 
directly known, if at all, only by persons themselves, 
although a person's intimate associates or a skilled 
observer may be able to surmise from indirect evidence 
what is going on "inside." Objective phenomena are those 
that can be known by evidence that is, in principle, 
directly accessible to an external observer. Often that 
evidence is actually a matter of record, although the 
relevant records may not be easily sampled for the 
population of interest. 

Thus, if an individual is asked to name a favorite 
author, to state whether the draft system is fair, to 
indicate how many children she wants to have, to identify 
an area in which she or he is afraid to walk at night, or 
to say whether he or she has ever wished to belong to the 
opposite sex, the information sought is subjective. But 
if an individual is asked whether he has served in the 
Armed Forces, how many children she has ever borne, how 
far she or he lives from the place of work, or to state 
his or her sex, the information is objective, even though 
its measurement may rely solely on the respondent serving 
as an informant rather than on records or reports of 
observers. When the accuracy of that measurement is 
questioned, its validity may be tested by referring to 
records or seeking corroboration from other informants. 
In household surveys, one respondent may often serve as 
informant for all members of the household, even for such 
personal but observable behavior as smoking habits, but
such proxy reports ordinarily would not be accepted for 
subjective phenomena. 

We recognize that for some subjective phenomena there 
is the possibility of substituting behavioral observations 
for direct questions, particularly when there is presumed
reason to distrust the latter. Economists, for example,
tend to rely on the principle of "revealed preference,"
letting actual behavior in the marketplace, such as
purchases of goods and services, serve as an indicator
for tastes or preferences (see Scitovsky, 1976). Some
critics of survey research have urged that greater use be
made of "unobtrusive measures," such as taking inventory
of the contents of garbage cans, in place of asking people
about their preferences for various foods (particularly
alcoholic beverages), on the grounds that biases
excessively compromise the validity of respondents' 
answers (Webb et al., 1966). 

However clear the distinction between objective and 
subjective phenomena may be in principle, there is much 
blurring in practice. Whether a person is "unemployed," 
under the definition currently used by federal 
statistical agencies, turns in part on whether he or she 
was "seeking work" during a specified period, and no 
proof of overt activity (such as visiting an employment 
office) is required to support the respondent's (or proxy 
respondent's) definition of "seeking work." There are 
many such "quasi-facts" (for example, ethnic group
membership, condition of housing, and being the victim of
a criminal act) that similarly allow latitude for the
respondent's definition of the criterion. For the sake
of economy, surveys often obtain what are at best 
estimates or at worst mere guesses about matters that are 
in principle factual or objective, such as how many times 
the respondent attended religious services in the 
preceding year or total income of the members of the 
household. 

With these considerations in mind, we note that the 
realm of the subjective can enter into a survey in two 
distinct ways: the information sought may, itself, 
pertain to subjective phenomena (rather than factual or 
objective phenomena); or the method of securing
information may call for an estimate or impression 
reported by someone in a more or less favorable position 
to know the facts rather than relying upon objective 
determination by some well-described procedure (see 
discussion in Fienberg and Goodman, 1974:74ff). Our main
concern is with the first category: phenomena that are
intrinsically subjective. As defined above, such 
subjective phenomena are, in principle, directly 
observable only by respondents (subjects) themselves. 
However, in view of the importance of "quasi-facts" in
surveys and the enormous possibilities for distortion of
reports of objective phenomena, we also give some
attention to these hazards of surveys (see Chapter 5 and
Bailar; Smith; and Newman, in Volume 2).

There are at least three general uses for data on 
subjective phenomena, uses that lead survey organizations 
to collect such data on a large scale. First, if one 
wants to forecast some class of behavior, a possible 
strategy for doing so is to ask people about their 
intentions or plans: whether they intend to go to
college, whether they expect to purchase automobiles in 
the next several months, whom they plan to vote for, how 
many children they expect to have, and so on. One can 
also try to get at the dispositions or motives underlying 
such decisions and use the evidence of these attitudes to 
project the size of a market or the outcome of an 
election. Second, many (though not all) students of 
politics, economics, demography, and social behavior work 
with concepts and theories in which subjective phenomena 
constitute key explanatory variables. For such work, 
data on subjective phenomena are needed to understand the 
behavior for which explanations and theories are sought. 
Third, some researchers and writers on social indicators 
stress the importance of subjective social indicators in 
any attempt to develop comprehensive social reports on 
the quality of life (Campbell, 1976; Campbell et al.,
1976; see also Bureau of the Census, 1980, for an example
of the use of subjective measures in a social report).
In the United States, of course, the Declaration of 
Independence affirms the "pursuit of happiness" as one of 
our rights. One way to inquire how well we are doing in 
maintaining that heritage is to investigate the state and 
trend of "happiness." Moreover, in order to interpret 
social indicators based on objective data, subjective 
data are also needed to disclose the goals that people 
may have in mind: for example, how much pollution is 
"too much"? 



SUBJECTIVITY AND "SOCIAL FACTS" 

Since the notion of subjectivity in survey measurements 
is critical to our inquiry, further analysis of that
notion is useful at this point. As an example, suppose 
each of the following is a true statement: (a) The 
planet Saturn has rings. (b) The X family had a total
money income of $27,500 in year 1980. (c) Mr. X, when
asked in a survey, "What is the smallest amount of money 
a family of four needs to get along in this community?" 
replied, "$250 a week." 

Statement (a) is clearly a matter of objective fact, 
although any knowledge of how such a fact is established 
and comes to be scientifically accepted will concede a 
role to human perception and interpretive processes, both 
of which may be influenced by social interactions among 
observers and analysts. But, however great the role of 
the human factor in this sense, one would not consider 
statement (a) to be one about subjective phenomena. 

Statement (b) is, in one sense, a similarly objective
fact: in principle, relevant records might be produced
that would be similarly interpreted by competent
observers. Although one often uses someone like an adult
member of the X family as an informant concerning the
family's income, and does not call in other observers or
have recourse to records of transactions, that is a
matter of convenience rather than principle. But, still,
there is a difference between (a) and (b). Both concern
objective phenomena. But fact (b)
is a social fact, which is to say that its meaning 
incorporates implicit assumptions about society that are 
shared by some more or less extensive group of people. 
Part of the implicit meaning of statement (b) is that a 
world of purveyors outside the X household is prepared to 
exchange various goods and services for their dollars; 
otherwise there would be no such thing as "income." The 
phenomenon itself has an irreducible subjective aspect 
(Parsons, 1937:46). But inasmuch as one can count on the 
willingness of people to exchange goods and services for 
money, money has an agreed-upon meaning, and the units in 
which it is measured as well as the documents (currency, 
etc.) involved in exchange are likewise a matter of 
agreement. Hence, income is an "objective" phenomenon in 
much the same sense that the word "mason" (one who builds 
with brick, stone, or the like) has an objective meaning. 
In both instances, the meanings are carried by people, 
but the similarity of meanings carried by different 
people is great enough so that the implicit subjectivity 
may escape notice. 

In statement (c), we are, ostensibly, introducing
still a different sort of subjective element. At least
for a casual observer (such as an interviewer), there is
no way to find out what Mr. X thinks is enough money to 
get by on but to ask him. Or, at any rate, if one 
attributes some amount to him, he can deny the attribution 
without any possibility of being shown up as a liar (while 
if he lies too much about his own income, the Internal 
Revenue Service may well show him up). So, on the face
of the matter, statement (c) concerns a subjective 
phenomenon in the fairly strict sense of the word we 
suggested at the outset: subjective phenomena are those 
that, in principle, are directly observable only by 
persons themselves. But there is a bit more to be said. 
The Gallup Poll has often used the question cited in 
statement (c) since 1946, and the resulting data have 
been studied by Rainwater (1974:Table 3-4). He computes 
the mean (average) of all responses in each year's survey 
($Y_t$, for the t-th year) and compares it with the
national disposable personal income per family ($X_t$)
computed from national income statistics. Data are
available for 18 years in the 1946-69 period. For those
18 years Rainwater finds that the linear function

$$Y_t = 0.515\,X_t$$

provides an excellent fit to the observations ($R^2 = .967$).
Over the period in question, real income actually 
increased more than 50 percent in the United States. But 
each year's progress in real income was matched by an 
increase in the amount perceived as "what a family needs," 
so that the proportionality of the foregoing equation 
tended to remain stable. Much other evidence (see Chapter 
6) supports the proposition that people's sense of what 
is "needed" or what is "satisfactory" depends on the 
prevailing level of income, and should the latter change, 
the perception or standard soon adjusts to the new level. 
Hence, although Mr. X is indeed speaking for himself and 
reporting a subjective phenomenon, that phenomenon itself 
is shaped by a powerful social consensus, or what Durkheim 
(1895/1938:XLIX) termed "collective representations." It
should be noted that the several hundred Messrs. X who 
respond to the Gallup Poll need not be aware of the social 
pressure that is shaping their standards of living and 
that there is room for idiosyncratic variation in their 
concepts for structuring of individual consciousness by 
the social milieu. Statement (c), then (or, rather, the
distribution of a set of such responses), is an excellent
example of both a "social fact" in Durkheim's sense and a
subjective phenomenon in our terms. 
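
The proportional relationship reported above can be illustrated with a short sketch. The code below fits a line through the origin by least squares, which is one plausible way to obtain a coefficient like .515; the source does not say how Rainwater estimated the coefficient or which definition of R-squared he used, so both choices here are assumptions. The income and response figures are hypothetical values invented for illustration, not Rainwater's or Gallup's data.

```python
def fit_through_origin(x, y):
    """Least-squares slope b for the model y = b * x (no intercept)."""
    sum_xx = sum(xi * xi for xi in x)
    if sum_xx == 0:
        raise ValueError("x must contain at least one nonzero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum_xx


def r_squared(x, y, slope):
    """Share of the variance in y accounted for by the fitted line.

    Assumption: R-squared is taken relative to the mean of y; the
    source does not state which definition was used.
    """
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_residual / ss_total


# Hypothetical yearly values (NOT the published series):
# income_per_family plays the role of X_t, mean_needed the role of Y_t.
income_per_family = [4000.0, 5000.0, 6000.0, 8000.0]
mean_needed = [2100.0, 2500.0, 3100.0, 4100.0]

slope = fit_through_origin(income_per_family, mean_needed)
r2 = r_squared(income_per_family, mean_needed, slope)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
```

A slope that stays stable while income rises is what the text means by proportionality: the amount perceived as needed tracks a roughly constant fraction of prevailing income.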



Consider a fourth hypothetically true statement: (d) 
Mrs. X, when asked in the General Social Survey, 1 "Taken 
all together, how would you say things are these days:
would you say that you are very happy, pretty happy, or
not too happy?", said, like somewhat more than half of 
all respondents, "pretty happy." The preceding discussion 
will have alerted us to the possibility that there is 
more social content to her response than might appear at 
first sight. No doubt the notion of what happiness means 
is one that Mrs. X has largely acquired from her milieu, 
and she may well ask herself how a person of her age, 
sex, family status, economic level, community and 
neighborhood location, and so on, would be expected to 
respond when "things" are "taken all together." Moreover, 
there is something very suggestive of a norm of response 
in the fact that the modal category is the middle one and 
the response distribution shows no clear trend from year 
to year. Still, the question invites a personal response. 
The referent is not a somewhat hypothetical "family of 
four" (which might differ appreciably from the size of 
the X family itself) but the respondent "you" herself.

The happiness question is indeed prototypical of what 
advocates of subjective social indicators propose in 
calling for "direct monitoring of key social-psychological 
states of the population" (Campbell and Converse, 1972:9). 
These writers cite perceptions of complex social 
situations, expectations, hopes, and frustrations of 
members of the society as basic "subjective data." And 
Campbell (in Campbell and Converse, 1972:442) proposes a 
concern with the quality of personal experience and cites 
as relevant aspects thereof the frustrations, 
satisfactions, disappointments, and fulfillment that must 
be assessed or evaluated by the people experiencing these 
feelings or reactions if they are to be ascertained at 
all (except in the very indirect way that Durkheim 
[1895/1938:8] proposed, i.e., via the analysis of rates 
of marriage, birth, and suicide). 2 

There are traps here for the unwary. To be sure, when 
"social-psychological states" are at issue, the individual 
presumably is the uniquely well-qualified informant. But 
the fact that respondents are generally compliant and 
quite willing to engage in introspection does not mean 
that they do their job well. On some topics a person may 
be a poor judge of his own motives, and thus it is no 
good asking him how his mind works (Nisbett and Wilson, 
1977; but see also Ericsson and Simon, 1980). 
Presumably, if disposed to do so, a respondent can tell
the interrogator what she intends to do or whether she 
has made up her mind about something, but not necessarily 
why she'll do it. And while a respondent can no doubt 
state what he wants in the short run, it is most 
dangerous to assume that he'll be satisfied if he gets 
it. 

Can any person really assess how much affect or 
emotion suffuses his definition of a situation, e.g., how 
"strongly" he feels about some public issue? The case is 
not proven. 3 And certainly there are serious problems, 
as careful analysts have noted, with data collected by 
means of the "vague quantifiers" that are often attached 
to survey questions to provide "measures" of emotional 
intensity (Bradburn and Miles, 1979). Indeed, Ericsson 
and Simon's (1980:215) caveat about the use of verbal 
report data in psychological experiments applies equally 
well to survey measurements of subjective phenomena: 

Accounting for verbal reports, as for other 
kinds of data, requires explication of the 
mechanisms by which the reports are generated, 
and the ways in which they are sensitive to 
experimental [or survey] factors 
(instructions, tasks, etc.). 

This point may be well illustrated by contrasting a 
survey report that reads, "Most people say they favor 
handgun control," with one proclaiming "most people favor 
handgun control." Such a simple insertion brings strong 
emphasis to the limitations of the method of inquiry, and 
it alerts one to the problems that can arise in using 
such inquiries to predict behavior. (For an example, see 
Schuman and Presser's [1977-78] attempt to reconcile 
public support for gun registration in national surveys 
and legislative inaction on the same issue.) 

The extent to which the answer to a question calling 
for introspection is valid cannot be known a priori; the
fact that only the subject herself can know if she is 
happy does not mean that her report on happiness 
constitutes a trustworthy scientific datum. A persistent 
theme of the critics of research on attitudes and opinions 
(see the classic statements of McNemar [1946] and Blumer 
[1948]) has been precisely that survey researchers too
often take the easy way out. As a result, the research 
is superficial in regard to both the psychological 
structuring of responses and the social determinants of 
individual dispositions. 



As in all scientific inquiry, there is a paradox of 
observation: to collect the data to be used to test a 
theory one must presuppose a lawfulness of the behavior 
involved in making the observation (Reiss, 1980). "Like
all existential dilemmas in science . . . the paradox is
resolved by a process of approximation" (Kaplan, 1964:54)
or iteratively. The more one understands about behavior,
the more reliable and valid will be the data collected, 
and the better the data, the more their analysis will 
enhance understanding. 



BACKGROUND OF THE STUDY 

Questions about the reliability, validity, and relevance
of survey measurements are not new, and we are not the
first committee to study those questions. In 1945, the
National Research Council (NRC) joined with the Social
Science Research Council to establish a Joint Committee 
on the Measurement of Opinion, Attitudes and Consumer 
Wants. With financial support from the Rockefeller 
Foundation, a committee of distinguished social 
scientists, statisticians and practitioners was 
established under the chairmanship of Samuel 
Stouffer. Among the major topics identified by this 
group at its first meeting was "the validity of 
statements, opinion and information furnished by 
respondents." The minutes of this group's first meeting 
go on to note that " [this] was considered to be a 
difficult and long-range problem but a fundamental 
one for the committee to be concerned with." 5 The four 
publications directly stimulated by the committee did 
not, however, concentrate on this question. Rather, they 
made important contributions toward codifying the theory 
of sampling (Stephan and McCarthy, 1958), examining the
effects of the personal interview situation on survey
responses (Hyman et al., 1954), and studying the problems
of inference that arise in longitudinal studies
(Anderson, 1954; Lipset et al., 1954).

The recent revival of interest in the "fundamental" 
issues reflects the confluence of several forces. 
Probably most important of these has been the apparent 
success of pollsters and survey researchers in making 
converts. Surveys of subjective phenomena have become a
ubiquitous component of social life in America. Not only 
do the press and private industry routinely quote their 
results, but the federal statistical system has become 
increasingly committed to the use of survey measurements
of opinions and attitudes. Historically, official 
statistics in the United States were largely the domain 
of demographers, economists, and statisticians. Inquiries
made by the Bureau of the Census, for example, generally 
focused on assessments of the size and distribution of 
the population and a variety of other phenomena that are, 
at least theoretically, objective, i.e., amenable to 
independent corroboration, such as age, income, and 
characteristics of industrial establishments. In recent 
years, however, published national statistics have come 
to include a growing complement of social statistics 
designed to measure explicitly subjective phenomena (see 
Executive Office of the President, 1973; U.S. Department 
of Commerce, 1977; Bureau of the Census, 1980). One of 
those recent volumes argued that such measures provide a 
vital addition to traditional statistics (U.S. Department 
of Commerce, 1977:XXVI):

The basic reason for including such 
[subjective] measures in this report despite 
the difficulties in their interpretation is 
that they offer a vital dimension in 
developing a comprehensive description of the 
condition of our society and the well being of 
its members. The bulk of the information 
presented [in this report] relates to people's 
objective situation or condition: their jobs,
their incomes, their health status, etc. The 
main purpose of the attitudinal measures is to 
provide some insight as to how people perceive 
certain aspects of their conditions. Such 
data are an essential source of information on 
people's values and aspirations. 

For similar reasons, the recent series of reports by the 
National Science Board (1973, 1975, 1977) on the state of 
science in the United States has incorporated data on 
public attitudes toward science and technology. 6 

Similar trends are also evident around the world. For 
example, since 1970 the British government has issued an 
annual report, Social Trends, that pays considerable 
attention to measures of subjective phenomena. Indeed, 
the authors of this report have argued that "the way 
forward lies not in adding more measures of conventional 
hard statistics, but rather in supplementing the existing 
ones by adding ... a dimension of the satisfaction 
(happiness, contentment, psychological well-being, etc.) 
felt by those who constitute the community" (Abrams,
1973:36). A recent United Nations report (1975:32) 
echoes this concern. 

This growing use of survey measurements of subjective 
phenomena has again raised questions concerning the 
validity and reliability of such measurements. 
Developments or events in four specific areas in 1976-78 
served as immediate catalysts for the present study. 

Measurements of Public Confidence in National 
Institutions. In 1976-77 an attempt was made to use a 
series of measurements of public confidence in the 
leaders of national institutions in an NRC report
(Kiesler and Turner, 1977; Turner and Kiesler, 1981), but
comparison of allegedly equivalent measurements made by 
two survey organizations showed substantial discrepancies 
in both the levels of reported confidence and the trends 
across time in those measurements (Turner and Krauss, 
1978; Smith, 1981). While the data were not included in 
the NRC report, they were widely reported in the media 
and were incorporated in such federal publications as 
Social Indicators and Science Indicators. Indeed, even a 
recent presidential commission both used and republished 
these data without commenting upon their fallibility 
(President's Commission on a National Agenda for the 
Eighties, 1980:103).

Measurements of Public Attitudes Toward Science. 
Surveys of public attitudes toward science that have been 
commissioned by the National Science Board for the Science 
Indicators series have been the subject of considerable 
discussion. Questions have been raised (e.g., Mac Lane, 
1978; La Porte, 1980), on technical grounds, that some 
survey questions are misleading or biased and, on 
philosophical grounds, that the surveys reify public 
opinion. There are doubts, too, as to whether most 
people have definite attitudes and knowledge about 
science. In addition to these questions, evidence has 
emerged that the Science Indicator surveys contained 
significant anomalies. For example, the responses across 
time to a question purporting to measure public support 
for spending on science and technology suggested a very 
sharp decrease between 1974 and 1976. This result was 
not consistent with other measurements made during the 
same period. A change in the content and ordering of 
survey questions in the 1978 Science Indicators survey, 
however, created a likely (although unproven) artifactual
explanation for this decline (see Turner, Volume 2). 
Happiness and the Quality of Life. Indicators of
public happiness, satisfaction, and the perceived quality 
of life have recently been the subject of serious 
analysis in economics, sociology, and psychology (see, 
for example, Easterlin [1974]; Davis [1975]; Campbell et
al. [1976]; Andrews and Withey [1976]). A question 
asking respondents about their "happiness" has been 
employed as a reference criterion for validating 
developmental work on subjective indicators of the 
quality of life. However, as Figure 1 illustrates, the 
trends obtained by different survey programs over time 
were not always convergent. 7 While plausible 
explanations have been advanced to explain these 
discrepancies, the discrepancies themselves and the 
speculativeness of the explanations raised questions 
about the meaningfulness, reliability, and error 
structure of these widely used indicators. 

Survey of University Faculty. A 1977 survey of 
American professors (Lang, 1978, 1981) provoked a 
spirited public controversy concerning the adequacy, 
accuracy, and appropriateness of survey measures that 
purported to reflect the respondents' attitudes and 
opinions. Many of the survey questions were attacked as 
"ambiguous," "meaningless," or tending to "prejudice" the 
issues. 



The problems illustrated by the above examples all 
appear to involve the influence of nonsampling factors on 
survey results. 9 It was unclear at the outset of this 
study whether our understanding of the potential effects 
of such nonsampling factors in survey data was sufficient 
to specify and control the relevant factors in the 
future. Nor could one be confident that policy makers 
and other users of such survey data appreciated the 
possibility of systematic errors in such measurements and 
the implications that such errors might have for their 
own decision making. Indeed, there were even doubts as 
to whether "error" was an appropriate concept to use in 
the discussion of most of these problems. 

It was against this background that the present study 
was launched. Our aim was to stimulate improvements not 
only in the practice of survey research and the 
understanding of the processes that produce its measurements 
(and its errors), but also in the use of research findings 
by the general public and by clients with all kinds of 
specialized interests. 

FIGURE 1 Survey Measurements of Self-Reported Happiness 
Made by Three Survey Research Programs 

NOTES: Estimates are derived from sample surveys of the 
noninstitutionalized population of the continental United 
States age 18 and older. Error bars demark approximately 
1 standard error around sample estimates (assuming 
sampling variance for these clustered samples to be twice 
that for simple random samples). 

Although the questionnaires show a minor divergence in 
the wording of this question, data from these two sources 
have been treated as a unitary time-series in other 
publications (e.g., Campbell et al., 1976; Andrews and 
Withey, 1976). 

QUESTION: "Taken all together, how would you say things 
are these days--would you say that you are very happy, 
pretty happy, or not too happy?" (NORC) 

"Taking all things together, how would you say things 
are these days--would you say you're very happy, pretty 
happy, or not too happy these days?" (SRC) 

SOURCES: NORC General Social Survey (GSS), Codebook, 
1972-74; NORC Continuous National Survey (CNS) [cycles 1 
and 10 + 11]. See Turner, Volume 2. SRC national data 
from time-series presented in Campbell et al. (1976). 
Survey dates from Campbell et al. (1976); Andrews and 
Withey (1976); and J. Varvra (personal communication). 

PUTTING SURVEYS AND THEIR PROBLEMS IN PERSPECTIVE 

A large amount of survey data concerning subjective 
phenomena is generated more or less continuously. At the 
same time, strong skepticism about the value of such data 
is expressed from time to time. And even convinced users 
of these data are likely to acknowledge problems in 
demonstrating their reliability, validity, and relevance. 
All experienced research workers are aware, for example, 
that the proportion of respondents giving (say) a "yes" 
response on some question can sometimes be modified 
considerably by altering the wording of the question. 
Sometimes this knowledge is used in a deliberate way to 
bias the outcome of a poll, and when this occurs, or is 
alleged to occur, the news may discredit the entire field 
of survey research and the polling industry as a whole. 
There is need, therefore, for greater knowledge about 
such matters, not only on the part of the public and the 
consumers of poll and survey results, but also on the 
part of survey technicians and analysts. 

There is a similar danger that the evidence of error 
in data about subjective phenomena that we review might 
lead to unwarranted and wholesale rejection of survey 
evidence on important social issues and problems or to 
ill-advised efforts to police the practice of polling 
(see, for example, the proposal by the Government 
Accounting Office, 1978). Indeed, looking at our litany 
of difficulties, some readers may come to wonder whether 
anything at all can be learned from such surveys or 
whether surveys have a useful role to play in social 
thought and policy. The following considerations have 
led us to answer "yes" to those questions. 



Fallibility of Measurement in Other Sciences 

As we previously noted, fallibility and error are not 
confined to the subjective realm, and public 
misunderstanding of such errors is often quite substantial (as 
evidenced by the controversy surrounding the population 
statistics of the 1980 census). Furthermore, just as 
fallibility of measurement is not limited to the 
subjective domain, neither is it unique to survey 
measurements or social statistics. Recent reviewers 
(Hunter, 1977; Lide, 1981) have used the data presented 
in Figure 2 to demonstrate the variability that exists 
among measurements of such elementary physical phenomena 
as the thermal conductivity of copper. Commenting on this 
figure, Hunter (1977:2) noted that "... although each 
analyst measured a physical quality that did not vary 
with location or time, it is clear that a remarkable 
variability attended the measurements." Moreover, he 
concluded (p. 2): 

The variation in attempting to evaluate the 
same physical constant is obvious. This 
example is not unusual. Similar plots of 
thermal conductivity as a function of 
temperature for approximately 400 common 
metals and material can be found in a 
supplement to the Journal (Ho et al., 1974). 
Nor is the observed variation in the 
measurement of "thermal conductivity" unique 
among physical parameters . . . 

Not only can physical measurements vary wildly, but 
even well-publicized "discoveries" in the physical 
sciences have sometimes been shown to be experimental 
artifacts. For example, between 1963 and 1974 more than 
500 articles in journals (including Science and Nature) 
discussed a supposed new substance: anomalous water or 
polywater. Although it resembled ordinary water, 
polywater was alleged to have a greater density, a 
reduced freezing point, and an elevated boiling point, 
among other anomalous properties. In the end, however, 
it was discovered that this "new substance" was nothing 
more than an impure solution of ordinary water (see 
Franks, 1981; Eisenberg, 1981). 

Even measurements of geological events thought to be 
important precursors of major earthquakes have sometimes 
been called into question. For example, Jackson et al. 
(1980) recently published data that suggest that improper 
calibration of measuring devices accounts for the 
disturbing reports that there was a significant uplift of 
a portion of Southern California between 1959 and 1974 
and an apparent sinking between 1974 and 1977. 10 
Common biological measurements have also shown similar 
fallibility. 11 

While measurement problems abound in the nonsocial and 
nonsubjective realms, there are also some noteworthy 
institutional safeguards that have been developed to deal 
with such measurement fallibility. For example, explicit 
programs of interlaboratory research and coordination of 
measurements have been developed by analytical chemists 
(see Youden and Steiner, 1975). Measurements are made 
with standard samples that should, in principle, always 
yield the same result. Comparing the variability of 
repeated measurements made under "identical" conditions 
(i.e., in the same laboratory, by the same scientist, 
using the same measurement procedure) with the variability 
of measurements made by different scientists, in different 
laboratories, and by different analytical procedures 
allows for a better understanding of the error 
structure of such measurements. Not only do such 
procedures permit a more realistic appraisal of the 
significance of interlaboratory comparisons of 
measurements but they also permit the identification of 
"rugged" analytical procedures, i.e., those that produce 
low variability across laboratories, trials, and 
experimenters. 
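
The comparison described above can be sketched numerically. The following Python fragment is a minimal illustration, not a procedure taken from Youden and Steiner (1975). It assumes a balanced design (every laboratory measures the same standard sample the same number of times) and a simple one-way random-effects model; the function name `variance_components` and the readings are hypothetical.

```python
from statistics import mean, variance

def variance_components(readings_by_lab):
    """Split measurement variability into within- and between-laboratory parts.

    Assumes a balanced design: each laboratory reports the same number of
    repeated readings on the same standard sample.
    """
    n = len(readings_by_lab[0])
    if any(len(lab) != n for lab in readings_by_lab):
        raise ValueError("balanced design required")
    # Within-laboratory variance: average of each lab's own sample variance.
    within = mean([variance(lab) for lab in readings_by_lab])
    # Mean square between laboratories, from the spread of the lab means.
    ms_between = n * variance([mean(lab) for lab in readings_by_lab])
    # Method-of-moments estimate of the between-lab component, floored at zero.
    between = max(0.0, (ms_between - within) / n)
    return within, between

# Hypothetical readings: three laboratories, three trials each.
labs = [[10.0, 10.2, 9.8], [11.0, 11.2, 10.8], [9.0, 9.2, 8.8]]
within, between = variance_components(labs)
print(f"within-lab variance:  {within:.3f}")
print(f"between-lab variance: {between:.3f}")
```

In these terms, a "rugged" procedure is one for which the between-laboratory component stays small.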

The systematic application of experimental procedures 
in a coordinated interlaboratory program of measurements 
provides one important paradigm that might be (but seldom 
has been) applied in survey research (see Chapter 5). 
The physical and chemical sciences provide a further 
example in their compilations of extant measurements, 
e.g., the Journal of Physical and Chemical Reference 
Data, and in the role played in this work by the National 
Bureau of Standards (NBS), which is concerned with the 
qualities and procedures of measurements of such standard 
units as the kilogram, minute, degree centigrade, etc. 
Its programs provide the decentralized laboratories of 
federal, state, local and private organizations with 
technical and operational guides to measurement 
procedures and tolerances (see Hunter, 1980). Since 
these measurements are often used in law, government 
regulations, and private contracts, the existence of an 
official reference bureau for such measurement is a 
practical necessity. 12 Since survey research does not 
presently have similar institutional mechanisms, we 
subsequently recommend a variety of institutional 
safeguards that may provide survey research with analogs 
to the quality control procedures developed in other 
areas of scientific measurement. 



Alternatives to Surveys 

Insofar as subjective phenomena are important to public 
opinion and public policy, people will find some way to 
ask and answer the treacherous questions that they often 
entail. As a result, any suggestion of doing without 
surveys must consider what would come in their stead. 
Political commentators are one alternative. Rather than 
engaging in any sort of systematic sampling, they try to 
determine how "the people" view life, what people would 
answer if they were asked directly. Although the answers 
such commentators provide may reflect a sensitive 
understanding of the issues involved, these answers are 
inevitably colored by personal and political prejudices. 

Another alternative for doing without surveys is doing 
without verbal questions. Many economists argue, for 
example, that one should look only at behavior. In this 
view, life conducts its own surveys, presenting an 
unending series of problems that people resolve in 
actions that can be interpreted as revealing their 
underlying preferences or values. Although based on an 
appealing argument, this position neglects the ways in 
which life poses questions in incomplete and even 
misleading forms. Some people argue, for example, that 
the aim of advertising and marketing is to induce 
consumers to assess their own preferences in a biased 
manner. When information is withheld or misrepresented, 
the interpretation of people's "economic behavior" is 
complicated further. Even when the market formulates its 
questions and presents its options in a balanced and 
forthright manner, the resultant behavior reveals at best 
people's underlying values only to the extent that one 
subscribes to the economic model of human rationality. 
This model involves controversial prescriptive 
assumptions, incompletely substantiated descriptive 
claims, and numerous ad hoc assumptions, regarding, for 
example, people's beliefs about one another's bargaining 
power, their sophistication regarding governmental 
monetary policy, or the implicit constraints they place 
on the set of available options (see, for example, Meeks' 
review of economic theories of utility in Volume 2). 
Finally, analysis of observed behavior can provide only a 
tenuous guide to attitudes that have yet to be tested by 
(and expressed in) real-life decisions. 

Nonetheless, both of these alternative strategies have 
important contributions to make to the understanding of 
subjective phenomena. Commentators can, for example, 
offer creative hypotheses as to what bothers people, free 
from survey researchers' penchant for restricting 
discussion to issues for which data are available. The 
observer of behavior can discern something about how 
important various values are when the time comes for 
action, as opposed to how important those values appear 
to be in the ruminations of people who have not yet been 
forced to act on what they say they want. Yet insofar as 
one wants to know what people think and believe about 
their world and themselves, there is no substitute for 
asking them directly. 

The knowledge obtained by asking is both unique and 
necessary, making it a vital complement to the insights 
provided by other procedures. And, therefore, 
improvement in the practice and understanding of the 
survey process is desirable. It is to this end that we 
have tried to summarize what is known, describe what is 
possible, and indicate, as best we can, those directions 
that may prove fruitful in the future. 



Summary of Findings 



THE USE OF SURVEYS 



Surveys are ubiquitous. In America it is estimated that 
more than 20 million survey interviews are conducted each 
year. The majority of those surveys include measurements 
of subjective phenomena, such as people's attitudes toward 
products, political candidates, or government policies. 
And although one cannot be certain, it appears that the 
survey enterprise is growing. Technological innovations 
such as computer-assisted telephone interviewing are 
expected to spur that growth. A similar situation 
prevails in Great Britain, the one other country studied 
by the panel. 

The U.S. government is heavily involved in the survey 
enterprise, and it appears that the content of government 
surveys has become increasingly subjective. On the day 
we examined it, the government registry of current surveys 
contained 228 active surveys (excluding the decennial 
census); in total, these surveys required more than 5 
million interviews annually and more than 1 million hours 
of respondents' time. The majority of these surveys 
incorporated some subjective measurements, often to 
provide information to assist in the evaluation of 
government programs. 

Survey research has also come to play an increasingly 
important role in the social sciences. Our survey (see 
Chapter 2 and Presser, Volume 2) of literature in 
sociology, political science, social psychology, and 
economics indicates that between one-quarter and one-half 
of the articles published during 1979-80 in core journals 
in each field (except social psychology) used survey 
data, and these data were frequently subjective. In each 
discipline, the proportion of articles using survey data 
had more than doubled in the last 30 years, and secondary 
analysis of survey data was also reported more 
frequently. 

While government and academic surveys account for an 
important and noteworthy part of the survey enterprise, 
they are a small part in comparison with the commercial 
sector. That sector includes surveys conducted as part 
of market research programs and to explore the 'public 
images' of major corporations. Such commercial survey 
research is almost entirely proprietary and thus most of 
it is unavailable for intensive scrutiny. But we did 
learn, for example, that one survey program of the 
American Telephone and Telegraph Company conducts 
approximately the same number of survey interviews (5 
million annually) as we found in the entire government 
registry. 

The one part of the commercial (i.e., nongovernment 
and nonacademic) sector that is in the public domain and 
available for systematic inspection is that performed for 
the news media. Although this makes up but a small 
fraction of the commercial sector, it is very important 
because of its rapid and broad dissemination by radio, 
television, and newspapers. Our own exploratory study of 
newspaper clippings indicated that in a 1-month period 
more than 200 million copies of stories mentioning polls 
were published in the United States and more than 80 
million were published in Great Britain. 13 These poll 
reports, not surprisingly, generally covered topics of 
current interest, for example, the popularity of 
political leaders, reactions to the state of the economy, 
voting intentions, and so forth. 

We should emphasize here that the availability of data 
and the distribution of experience of our members led 
this panel to concentrate on survey measurement as 
practiced in public arenas (including government and 
academic research and journalism). We did not consider 
in depth the use of surveys in market research. 
Therefore, although many of our conclusions may 
generalize to market research surveys, it cannot be 
assumed that this will always be the case. 



ABUSES OF SURVEYS 

The survey method lends itself to various abuses. 
Selling products under the guise of "surveying" is a 
frequent practice, and the inclusion of pseudo-surveys in 
direct mail fund raising often appears to be only a 
cynical device for increasing contributions, not a 
sincere effort to obtain people's opinions. Another 
frequent abuse is "surveys" in which the respondents 
volunteer to take part. During the 1980 presidential 
campaign, a major television network sponsored and widely 
reported such a pseudo-survey to assess the public's 
reaction to the campaign debate between President 
Carter and candidate Reagan. Such call-in shows are not 
open to criticism, per se, but claims that they provide 
an accurate reading of public reaction are unwarranted 
and misleading. 

Some surveys trade on the presumed scientific 
objectivity of the survey method but distort their own 
measurements. These misrepresentations of popular opinion 
are fed back to the public or to members of Congress in 
the apparent hope they will affect public or congressional 
opinion. One such example occurred during congressional 
debate over the establishment of a Consumer Protection 
Agency. Misleading survey results were widely reported 
by opponents of the agency to support claims that the 
vast majority of the public did not wish to see such an 
agency established (see Chapter 3 and American Association 
for Public Opinion Research, 1976). More recently, 
Mitchell (1980) criticized an advertisement placed in the 
Washington Post (December 4, 1979) by the American 
Nuclear Energy Council for "misrepresentation" and 
"outright error" in its reports of surveys of public 
opinion on nuclear power after the Three Mile Island 
accident. 

The selective disclosure of poll results provides one 
technique for manipulation. To remedy the effects of 
such distortions, public critiques and even instantaneous 
rebuttal polls have sometimes been used. The resources 
required to carry out such efforts are, however, often 
considerable, and so they are not readily available to 
all parties who may be aggrieved by polls that make 
self-serving claims to represent public opinion with 
biased questions or selective reporting of results. 

There have been efforts to encourage higher standards 
for survey practice and reporting. Currently, major 
efforts include attempts by the American Association for 
Public Opinion Research (AAPOR), the Council of American 
Survey Research Organizations (CASRO), and the National 
Council on Public Polls (NCPP) to require more adequate 
disclosure of survey methods and survey results for 
surveys that are presented to the public. These 
organizations have required their members to assume the 
burden of publicly combatting distorted or selective 
reporting of survey results by their clients. The CASRO 
and NCPP standards also require that specified 
information on each survey (including sponsorship, dates 
of interview, description of population sampled, sample 
size, and wordings of cited questions) be included in all 
poll reports. 

In reviewing newspaper practices, however, we found 
that most poll reports do not now include the information 
required by these standards. This does not necessarily 
imply dereliction by the survey organizations, since they 
do not control what is printed by the press. It does 
mean, however, that the use of surveys by the media is 
open to manipulation in ways that cannot be easily 
detected because a great many news stories do not provide 
the information (for example, sponsorship and question 
wording) that would be required to discover potential 
biases. 

Because poll and survey results can be manipulated and 
because of the unequal distribution in society of the 
resources needed to conduct them, one must question the 
claim that these methods necessarily lead to the 
democratization of political and social decision making. 
Such an outcome would presuppose a universally high 
standard of technical competence and professional 
integrity on the part of all those who design, execute, 
report, and interpret poll and survey results. 



MEASUREMENT 

As part of its work the panel struggled to relate 
fundamental ideas of measurement to surveys of subjective 
phenomena. On first consideration, subjective phenomena 
appear to present unique problems. As in (almost) all 
surveys of human populations, the respondent typically 
knows that a measurement program is going on and this 
affects responses in ways that may be difficult to 
anticipate or correct. Moreover, surveys of subjective 
phenomena involve the measurement of characteristics that 
are directly observable, if at all, by only one person, 
the respondent. Thus strategies for verifying the 
accuracy of a measurement are inherently problematic. We 
record below a brief summary of our skirmishes with this 
problem. 

Measurement and Error 

Measurement systematically assigns numbers to 
observations so that (particular) empirical relationships 
among the observations correspond to the mathematical 
relationships among the numbers assigned to them. All 
measurement models thus specify some pattern of 
postulated relationships. 

The empirical search for such patterns is complicated 
by the presence of error. An important area of 
investigation, therefore, concerns factors that disturb 
the measurement process. There are two general types of 
error in measurement: variability from one circumstance 
to another and discrepancy on average from an ideal 
measurement (or true value). The term reliability is 
used to describe the degree of absence of the first type 
of error, variability. Reliability is usually assessed 
either by repeated measurements at separate times or by 
the internal consistency of several variants of a 
measurement at a single time. 
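
Both ways of assessing reliability can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the repeated-measurement figure is taken to be the Pearson correlation between two administrations, and internal consistency is summarized with Cronbach's alpha, a common coefficient that the text above does not itself name. The function names and the response data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Correlation between two measurements of the same respondents."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cronbach_alpha(items):
    """Internal consistency of several items answered at a single time.

    `items` holds one list of scores per item, respondents in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance / variance(totals))

# Hypothetical 5-point ratings from six respondents, asked on two occasions.
first_interview = [4, 2, 5, 3, 1, 4]
second_interview = [4, 3, 5, 2, 1, 5]
print(f"test-retest r = {pearson_r(first_interview, second_interview):.2f}")

# Hypothetical scores on three variants of the question, asked at one time.
item_scores = [[4, 2, 5, 3, 1, 4], [5, 2, 4, 3, 2, 4], [4, 1, 5, 2, 1, 5]]
print(f"alpha = {cronbach_alpha(item_scores):.2f}")
```

Values near 1 on either coefficient indicate little variability of the first type; neither coefficient says anything about discrepancy from a true value.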

The term validity is used to describe the absence of 
the second type of error, discrepancy from an ideal 
measurement (or true value). A measurement has validity 
if it is close to the true value. There are several 
kinds of validity, variously referred to as content (or 
face) validity, predictive validity, and construct 
validity. Content validity refers to whether the 
measurements seem on a common-sense basis to be what they 
are claimed to be. Predictive validity refers to the 
applicability of the measurement result to some 
consequence, for example, whether people who do well on a 
problem-solving test also do well on real-world problems 
like those on the test. Construct validity refers to 
whether the measurement is appropriately related to other 
measured variables, as specified by theory. 

The terminology used in the foregoing discussion comes 
from the general field of psychometrics and more 
particularly from mental testing. Problems of 
reliability, validity, the specification of measurement 
models, and the assessment of error are, however, part 
and parcel of all science, pure and applied, not only of 
survey research and psychometrics. We therefore 
consulted a broader literature dealing with both 
abstract, philosophical issues and specific, pragmatic 
problems. The philosophical literature is, in a word, 
meager; while there is ample writing on the idea of 
measurement in idealized senses, there is little writing 
on measurement errors. 



Models of Measurement Variation 

The traditional statistical approach to error for metric 
variables 14 was developed by specifying that the 
observed value is equal to the true value plus the 
error. Systematic error can be distinguished from 
dispersion, which is the tendency for observations to 
vary; dispersion can, in a sense, be controlled by 
taking additional observations. Systematic error in some 
cases is properly described as bias, the expected 
difference between the observed and true values. The 
validity of a measurement increases as systematic error 
decreases; its reliability increases as the dispersion 
(i.e., random error) decreases. 
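
The distinction can be made concrete with a small simulation. The sketch below assumes the simplest version of the model, observed value = true value + bias + random error, with normally distributed random error; the function name and all numerical values are hypothetical. It illustrates that taking additional observations shrinks dispersion but leaves systematic error untouched.

```python
import random
from statistics import mean, stdev

def simulate_sample_means(true_value, bias, noise_sd, n_obs,
                          n_trials=1000, seed=0):
    """Return the mean of n_obs noisy observations for each of n_trials studies."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(n_trials):
        observations = [true_value + bias + rng.gauss(0.0, noise_sd)
                        for _ in range(n_obs)]
        sample_means.append(mean(observations))
    return sample_means

TRUE_VALUE, BIAS, NOISE_SD = 50.0, 3.0, 10.0  # hypothetical values
for n_obs in (10, 100):
    means = simulate_sample_means(TRUE_VALUE, BIAS, NOISE_SD, n_obs)
    print(f"n = {n_obs:3d}: average error = {mean(means) - TRUE_VALUE:+.2f}, "
          f"dispersion of estimates = {stdev(means):.2f}")
```

With these settings the average error should stay close to the assumed bias at both sample sizes, while the dispersion of the estimates falls as the number of observations grows.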

In some problems, however, error patterns are not well 
described by this statistical approach. Unfortunately, 
there is little general theory of error structure for 
nonmetric variables, particularly for categorical 
variables with several ordered categories. Aside from 
formal problems (where suitable mathematical functions 
and distributions of the probability of error are 
specified), a fundamental consideration of measurement is 
whether the mechanisms by which errors are generated can 
be understood using conventional models. 

The traditional statistical approach to measurement 
error is intimately related to the theory of sampling of 
human (and other) populations. Indeed, the concept of a 
true value for a measurement is relatively easy to defend 
and is useful if one thinks of it as a parameter 
(characteristic) of a finite population that is estimated 
by observing a sample according to an explicit procedure. 
We have reviewed some of the methodological work in this 
vein. Researchers at the Bureau of the Census, in 
particular, have been leaders in extending sampling 
models to incorporate various sources of variability. 
They have, for example, developed and applied experimental 
designs to estimate the proportion of the total variance 
accounted for by variability in response by individual 
respondents. They found that results differ greatly by 
type of item. For example, when it is difficult to 
retrieve factual information, response inconsistency is 
relatively high. How this kind of finding for factual 
items carries over to subjective variables is not obvious. 
The Census Bureau studies also provide evidence on 
variability among interviewers. In general, questions 
with high response inconsistency also show high 
interviewer variability. Limited research on coder 
variability also suggests that it too differs by type of 
question. 

While the designs of these studies on sources of 
measurement variation (or reliability) are relatively 
straightforward, attempts to assess systematic error (or 
validity) are conceptually as well as operationally 
problematic. Two kinds of reinterview studies have been 
regularly done by the Bureau of the Census. One routinely 
provides an index of measurement inconsistency based on a 
repetition of the initial measurement. Less routinely, 
more highly trained enumerators conduct longer and more 
probing interviews on the assumption that such interviews 
produce a closer approximation to accurate answers, that 
is, have greater validity than the standard interview. 
For example, in a study relating to the census question, 
"What language, other than English, was spoken in this 
person's home when he was a child?," the reinterview 
collected details about who used the language, under what 
circumstances, how often, and whether English was also 
spoken. The results permitted calibration of the census 
results to the more elaborate intensity scale. This 
study thus shares some of the features of the proposal, 
often made and infrequently implemented, that standard 
survey interviews be supplemented with depth interviews 
of subsamples of respondents. 

We have also examined a body of evidence developed 
largely in the context of academically oriented surveys. 
This work takes as its point of departure the observation 
that responses to survey questions are frequently affected 
by what appear at first glance to be minor variations in 
question wording. Closer examination shows that in many 
such cases the variations are minor only in the sense 
that the number of words changed is small, the meanings 
of the questions having indeed been substantially altered. 
However, there are also several well-documented instances 
in which a change in wording appears to be substantively 
trivial but the effects on responses are appreciable. 
Table 1 presents an example of one such instance. (The 
results in Table 1 were replicated quite recently 
[Schuman and Presser, 1981], indicating that the effect 
was not limited to the 1940 Roper Poll in which it was 
first discovered.) 
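
Whether a difference of this size could arise from sampling alone can be checked with a conventional test for the difference between two independent proportions. The sketch below uses the Table 1 figures (54% would forbid such speeches; 75% would not allow them; about 1,300 respondents per version). It assumes simple random sampling by default, which the 1940 poll may not have used, so the optional `design_effect` argument (a hypothetical name) inflates the sampling variance in the manner of the factor of two assumed for clustered samples in the notes to Figure 1.

```python
import math

def two_proportion_z(p1, n1, p2, n2, design_effect=1.0):
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    srs_variance = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
    return (p2 - p1) / math.sqrt(design_effect * srs_variance)

# Table 1: share opposing the speeches under each wording, n of about 1,300 each.
z_srs = two_proportion_z(0.54, 1300, 0.75, 1300)
z_clustered = two_proportion_z(0.54, 1300, 0.75, 1300, design_effect=2.0)
print(f"z (simple random sampling) = {z_srs:.1f}")
print(f"z (variance doubled)       = {z_clustered:.1f}")
```

Under these assumptions z is roughly 11, and it remains near 8 even when the variance is doubled. Both values lie far beyond conventional thresholds such as 1.96, so sampling fluctuation alone is an implausible explanation for the 21-point gap.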

There are also instances in which the ordering of 
questions, or of alternatives within questions, leads to 
noteworthy shifts in responses. Detection of such 
artifacts of order or wording requires experiments that 
are carefully designed, executed, and analyzed. Although 
some attention was given to these matters in the first 
decade or two of modern survey research, recent work has 
pointed up the desirability and feasibility of concerted 
programs of research on ways of detecting and controlling 
artifacts (see Chapter 5; Kalton et al., 1978; Schuman 
and Presser, 1981; Turner, Vol. 2). 

Analysts of survey data are aware of these problems 
and may attempt to cope with them in several ways. One 
general approach, widely followed by experienced analysts 
but often ignored in popular presentations, is to avoid 
taking univariate distributions very seriously, 
concentrating instead on relationships among responses 
and background variables such as education. Such 
relationships seem to be more resistant to variation in 
question wording and order than are single-variable 
distributions, although this is not always so. A second 
general approach is to work with sets of questions, 
turning them into multi-item indexes that may minimize 
idiosyncratic features of individual items. 



Table 1 Forbid and Allow Experiment 

"Do you think the United           "Do you think the United 
States should forbid               States should allow 
public speeches against            public speeches against 
democracy?"                        democracy?" 

Forbid ("Yes")          54%        Not Allow ("No")        75% 
Not Forbid ("No")       46%        Allow ("Yes")           25% 
                       ----                               ---- 
                       100%                               100% 


NOTE: Exact N's are not provided by Rugg, but information 
is given in Cantril (1940). Each version of the question 
was asked of about 1,300 respondents; hence, the 
differences are highly reliable. "Don't know" responses 
did not differ significantly by the form of the question 
and are omitted here. 

SOURCE: From Rugg (1941). 

Replication of Subjective Measurements 

Other evidence about the quality of measurements of 
subjective variables in surveys includes five examples of 
different survey organizations conducting the same survey 
in tandem. These examples provided the possibility of 
making comparisons on a variety of measures to assess the 
similarity of results obtained by the different survey 
organizations. Unfortunately, in none of these 
enterprises was the published analysis as full or as 
thorough as one would like. 15 These experiments 
nonetheless establish the possibility in principle of 
achieving a high order of reproducibility between 
organizations in survey research. Of course, no 
demonstration could do more than that. An unknown part 
of the success of such experiments can be attributed to 
the extensive cooperation between organizations in 
standardization of training, field-work procedures and 
supervision, as well as the impact of knowing that an 
intersurvey comparison was being carried out. 

There is more extensive evidence from nonexperimental 
comparisons that are possible when two organizations 
fortuitously ask the same questions at about the same 
time. Much of the time these comparisons yield 
discrepancies no larger than those to be expected by 
chance (i.e., by sampling or other random fluctuations). 
Unfortunately, this is not always the case, and one
cannot predict when larger discrepancies will occur or
identify the reasons for them when they do. 16
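
As an illustration of what "no larger than those to be expected by chance" can mean in practice, the following minimal sketch computes a pooled two-proportion z statistic for the same question asked by two organizations. It assumes simple random sampling, which real surveys with clustered or weighted designs do not satisfy; the function name and the example figures are hypothetical and are not taken from the panel's study.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent sample proportions.

    Assumes simple random samples (an assumption of this sketch; complex
    survey designs would need a design-effect adjustment).
    """
    # Pooled estimate of the common proportion under "no real difference".
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference between the two sample proportions.
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


# Hypothetical example: two organizations report 34% and 31% "very happy",
# each from 1,500 interviews.
z = two_proportion_z(0.34, 1500, 0.31, 1500)
print(f"z = {z:.2f}; within chance at the conventional 1.96 cutoff: {abs(z) < 1.96}")
```

Applied to the Table 1 figures (75 percent versus 54 percent, with about 1,300 respondents per question form), the same function yields a z statistic well above 10, which is consistent with the table note that those differences are highly reliable.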

As part of its work, the panel arranged several survey 
experiments to test hypotheses that arose from its review 
of nonexperimental survey comparisons. 17 Four 
experiments were conducted to test explanations for 
discrepancies in survey measurements of self-reported 
happiness (see Figure 1, above). The results of these
experiments indicated that these measurements were quite 
sensitive to the context in which the question was 
asked: namely, the preceding question(s). When the 
context was controlled (i.e., when the immediately 
preceding question was the same), the results were
equivalent; however, when the question context varied, 
the results obtained by different surveys varied 
considerably (see Figure 3). Other experimental results
(see Chapter 5) indicated that, although a context effect 
occurred with the question on general happiness, it did 
not occur for a more specific question asking respondents 
to evaluate the happiness of their marriages. 



[Figure 3 plot not reproduced. Panel a: General Happiness
Item. Vertical axis: percentage (ticks at 30, 40, 50, 60).
Horizontal axis: MEASUREMENT CONTEXT (Uncontrolled,
Controlled).]

FIGURE 3 Percentage of Respondents Saying They Were
Very Happy in Response to Question Asking About Their
General Happiness

NOTES: Estimates are derived from surveys conducted by
the National Opinion Research Center (NORC), the Survey
Research Center of the University of Michigan (SRC), and
the George Fine Organization for the Washington Post
poll. (Error bars demark approximately 1 standard
error around sample estimates.) All samples were
restricted to married respondents who had a telephone in
their residence. In the controlled context each
measurement of general happiness followed an identical
question asking about respondents' marital happiness; in
the uncontrolled context, the general happiness question
followed whatever else appeared in the survey, which
varied from one survey organization to the next (see
Chapter 5 for details).

QUESTION: All surveys used the NORC wording (see notes
to Figure 1).

This and similar evidence suggests that some kinds of
questions may be particularly vulnerable to nonsampling
errors: questions with vague or poorly described
referents, those with little relevance to respondents'
lives or of little interest to them, and those with
imprecise response categories. A much larger body of
experimental evidence, however, would be needed to verify
such conjectures. We recommend continuing programs run
in tandem with ongoing survey activities to provide the
requisite data (see Recommendations 9 and 10).



Analysis of Subjective Survey Data: 
Measurement and Structure 

Survey design and analysis require even-handed attention 
to both the subjects (survey respondents) who answer 
questions and the objects about which questions are 
asked. In the past, much survey analysis was concerned 
with ways in which respondents differed among themselves. 
The analytical problem was to explain inter-respondent 
variation in response to a single item by cross- 
classifying the response distribution with relevant 
respondent characteristics. Recent advances in 
statistical methods for attacking this problem include 
developments concerning structural equation models 
(Goldberger and Duncan, 1973; Jöreskog and Sörbom, 1979)
and logit models for categorical data (Goodman, 1978: Ch.
1). The panel chose not to review in detail these
well-known advances, but rather to call attention to some 
research designs that go beyond the classical approach to 
survey analysis and that merit systematic attention. 



The Scaling of Objects 

Alongside the emphasis on respondent characteristics in 
classical survey analysis, there were a few important 
contributions concerning the scaling of objects, in which 
the characteristics and attitudes of individual 
respondents were assumed to be irrelevant or 
uninteresting. Perhaps the foremost examples are the 
surveys of occupational prestige (in which occupational 
titles are the objects) conducted every few years since 
the mid-1920s in the United States and extended in recent 
years to at least 60 countries. The results show an 
impressive degree of consistency across countries, 
historical periods, characteristics of respondents, and
design of the scaling procedure. This invariance was
unanticipated but theoretically provocative. 

Other object-scaling studies concern the seriousness 
of crimes, the social distance rating of ethnic groups, 
and the scaling of adjectives that might be used to
express strength of support for policies or officeholders 
or of statements that could indicate intensity of positive 
or negative attitudes toward some social entity. These 
examples tend to assume evaluation along a single 
dimension. But recent elaborations (see Rossi, 1979) 
open up the possibility that such a dimension itself may 
be composite and that the weights of its components can 
be statistically estimated. It appears that object 
scaling is an approach in which survey research can make 
productive use of appropriately modified techniques first 
developed in the laboratory setting. 

One question that arises when a set of objects has 
been subjectively scaled is how the scale values relate 
to other variables describing the objects. In the case 
of occupations, the subjective variable (prestige, or 
perceived social standing) has been shown to be closely 
related to such objective variables as the educational 
and income levels of persons in the occupations. In the 
work on spatial segregation of ethnic groups, the data 
suggest a strong correlation between (subjective) social 
distance and (objective) physical separation of 
residential areas. Another interesting example of the 
relationship between subjective and objective involves 
social class. The proportion of respondents calling 
themselves "middle class" has been shown to vary over 
time in a pattern that is fully explained by the 
concurrent trends in respondents' occupational levels, 
educational attainment, and income. 

It is not always true, however, that changing the 
objective circumstances of respondents produces 
corresponding changes in their subjective assessments of 
those circumstances. For example, Easterlin (1973, 1974) 
discovered that in the United States, reports of personal 
happiness as measured by a simple survey question do not 
show an upward trend over time in parallel to the growth 
of aggregate income. The different patterns of results 
for social class and happiness show that the relationship 
of subjective to objective variables may be complicated 
and is always theoretically problematic. (One of our 
recommendations proposes more research systematically 
exploring such relationships; see Recommendation 16.)



Models for Measurement

While the development of theory and methods for sampling 
of human populations began to accelerate in the mid-1930s, 
so that by the early 1950s several major texts on sampling 
had been published, theory and methods of survey analysis 
developed more slowly. Part of the slowness was due to 
the lack of high-speed computing facilities in the early 
decades of survey research. But another part may have 
been the tendency for survey analysts to work in isolation 
from contemporary developments in theory and methods of 
statistical inference. We think such isolation should 
and will end in the near future. In considering the need 
to end that isolation, we reviewed some examples of newly 
developed methods and models for the analysis of 
categorical data of the kind typically collected in 
surveys dealing with subjective variables. 

The scaling model of Louis Guttman and the latent 
structure model of Paul Lazarsfeld are both more than 
three decades old, having been stimulated by survey 
research efforts during World War II. But their 
usefulness in survey analysis should be greater in the 
future than in the past because of recent progress on 
methods of estimation and testing goodness of fit. In 
addition to these well-known models, a great variety of 
models for response structures can now be handled 
rigorously and expeditiously. For example, models 
focusing on consistency of response across several items 
may prove useful in addressing such concepts as "issue 
consistency" or "issue constraint." Methods for 
analyzing sets of rankings of several objects include the 
fitting of models that assume general consensus on a 
unidimensional hierarchy of the objects (even when 
obscured by random variability of response) as well as 
models postulating more complex structures. Since some 
survey items have three or more response categories, it 
is interesting that the new models for categorical data 
include several that are specially adapted to such cases 
and that take advantage of the ordering of response 
categories in either of two situations: when the ordering 
is known a priori or when an ordering, which is assumed
to exist, must be inferred from the multivariate data. 
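
To make the idea of a unidimensional hierarchy of items concrete, here is a minimal sketch of the classical coefficient of reproducibility for a Guttman scale. This is the older descriptive index, not the newer estimation and goodness-of-fit methods mentioned above, and the error-counting rule used (deviations from the ideal pattern implied by each respondent's total score) is only one of several conventions. The function name and data are hypothetical.

```python
def reproducibility(responses):
    """Coefficient of reproducibility for 0/1 item responses.

    `responses` is a list of rows, one per respondent, each a list of 0/1
    answers. Items are ordered from most to least often endorsed, and each
    row is compared with the ideal Guttman pattern for its total score.
    """
    n_items = len(responses[0])
    # How many respondents endorsed each item.
    popularity = [sum(row[j] for row in responses) for j in range(n_items)]
    # Item indices from most endorsed ("easiest") to least endorsed.
    order = sorted(range(n_items), key=lambda j: -popularity[j])
    errors = 0
    for row in responses:
        score = sum(row)
        observed = [row[j] for j in order]
        # Ideal pattern: endorse the `score` most popular items and no others.
        ideal = [1 if rank < score else 0 for rank in range(n_items)]
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1.0 - errors / (len(responses) * n_items)


# Hypothetical data: four respondents fit the hierarchy, one does not.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [0, 0, 1]]
print(round(reproducibility(data), 3))
```

A value of 1.0 means every response pattern fits the assumed hierarchy; lower values indicate more deviations, although the index alone does not distinguish random response variability from a genuinely more complex structure.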

Survey data often give evidence of apparent response 
instability, as when a respondent appears to give 
contradictory answers to similar questions. Moreover, 
the "same" question can sometimes be asked in different 
ways, each of which yields a different response 
distribution. In these circumstances it is natural to
ask for criteria to use in deciding whether two or more
questions really measure the same subjective variable.
An analogous problem is faced in ability testing, and we
looked at one of the latent trait models originally
proposed in that context, that of Rasch (1960/1980).
His measurement model is attractive in that it provides 
exactly the evenhanded attention to subjects (or 
respondents) and objects (or items) that we advocated at 
the outset of this section. (Like any other model, of 
course, it accomplishes its goals at the expense of 
strong structural assumptions.) We suggest that this 
model be tried on attitude (rather than ability) items in 
the survey (rather than the mental testing) context. In 
this model, there is one parameter measuring the 
"attractiveness" of each item and one parameter for each 
respondent, measuring her or his propensity to respond 
"favorably" on the issue (such as approval of women's
working). The two parameters combine multiplicatively to
produce the odds that a respondent will give a favorable
response to a specified item. Assuming that items are
answered independently (an adequate research design is
needed to justify this assumption), a joint probability
distribution can be easily generated. The item parameters 
can be estimated and the model can be tested using 
straightforward extensions of methods for categorical 
data. Acceptance of the model entails the interpretation 
of the items as fallible measures of the same unobserved 
(latent) dimension. Although we know of a few small sets 
of questions for which the model apparently has a 
satisfactory fit to some survey data, the approach based 
on the Rasch measurement model (as far as survey research 
is concerned) is still in an early developmental stage. 
Whether it is likely to survive extensive testing, we do 
not know, but the model does exemplify how recent work on 
measurement can be as exciting to contemporary survey 
analysts as the work of the 1940s must have been to the 
statisticians then participating in the development of 
methods of survey sampling. 
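
The multiplicative structure described above can be sketched in a few lines. The code below is an illustration only: the function names and the parameter values are hypothetical, and no estimation is attempted. It computes the probability of a favorable response from the product of an item parameter and a respondent parameter, builds the joint probability of a response pattern under the independence assumption, and shows one reason the item parameters can be studied separately from the respondents: conditional on a respondent's total score, the probability of a pattern does not depend on the respondent parameter.

```python
from itertools import product
from math import prod


def p_favorable(item_attr, resp_prop):
    """Probability of a favorable answer when odds = item_attr * resp_prop."""
    odds = item_attr * resp_prop
    return odds / (1.0 + odds)


def pattern_probability(pattern, item_attrs, resp_prop):
    """Joint probability of a 0/1 response pattern, assuming independent items."""
    return prod(
        p_favorable(a, resp_prop) if x == 1 else 1.0 - p_favorable(a, resp_prop)
        for x, a in zip(pattern, item_attrs)
    )


def conditional_on_score(pattern, item_attrs, resp_prop):
    """Probability of a pattern given the respondent's total score."""
    score = sum(pattern)
    same_score = [
        q for q in product((0, 1), repeat=len(item_attrs)) if sum(q) == score
    ]
    total = sum(pattern_probability(q, item_attrs, resp_prop) for q in same_score)
    return pattern_probability(pattern, item_attrs, resp_prop) / total


# Hypothetical "attractiveness" values for three attitude items.
ITEM_ATTRS = [0.5, 1.0, 2.0]

for propensity in (0.3, 3.0):  # two hypothetical respondents
    print(propensity, conditional_on_score((0, 0, 1), ITEM_ATTRS, propensity))
```

Both respondents give the same conditional probability for the pattern, even though their propensities differ tenfold. This is only a demonstration of the model's structure under its own assumptions, not evidence that the model fits any particular set of survey items.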



SURVEY MEASUREMENT AS A PSYCHOSOCIAL PROCESS 

The panel also tried to use theoretical perspectives from 
several social science disciplines to conceptualize the 
interview as a psychosocial process. Our aim was to lay 
some of the necessary groundwork for a more integrated 
theoretical approach to the survey measurement process.
We focused attention on three topics: conceptual 
ambiguity in surveys, the role of respondents, and the 
question-and-answer process. 



Conceptual Ambiguity in Social Surveys 

Concepts used in social scientific theories have meaning 
not only in theory, but also in the real world. The 
theoretically appropriate definition of a social
construct--such as "public opinion," "risk,"
"unemployment," or "crime"--rests not only on scientific
and measurement criteria but also on how the
concept is conceived and used in the world.

Lay and scientific (or technical) concepts do not 
necessarily match. For example, lay definitions of 
unemployment probably do not fully correspond to the 
definition that is used by the Census Bureau in its 
Current Population Survey. When there is a difference 
between concepts as they are used by the public and as 
they are operationalized in surveys, published statistics 
and public debates about them are potentially 
misleading. In addition, if the concepts used in survey 
questions are not understood in the same way by the 
survey researcher and his or her respondents, then 
responses to the questions are likely to be misinterpreted 
by the researcher. A survey researcher, then, must be 
attentive to natural concepts and classifications and the 
relation of those to scientific concepts. (By contrast, 
physicists need not be substantively concerned with how 
physical concepts--such as the quark--are conceived by
laymen, and, since they do not rely on self-reports as a 
source of scientific data, they certainly need not worry 
that quarks may not define themselves as physicists do.) 

The fact that many constructs in social science have 
social as well as scientific meaning creates several 
pitfalls for researchers. One such pitfall is a 
temptation to rely on implicit understandings of a 
construct, without invoking independent normative or 
scientific criteria to define it. An example of this is 
the fundamental concept of public opinion. 

In the classical view, "public opinion" does not 
consist of an aggregate of individual opinions. Ideas 
that have not been formed and tested through discourse 
and public debate, for example, would not be deemed 
worthy of being called public opinion. However, with the 
advent of the survey method, this classical view was
often lost sight of because survey researchers implicitly 
equated public opinion with whatever public opinion polls 
measured. The research community, in exploiting new 
avenues of data collection, bypassed many of the concerns 
that had occupied classical theorists of public opinion 
and instead became preoccupied with questions that can be 
answered by using sample surveys alone. Some maintain 
that in studying survey responses they are by definition
studying public opinion. As Blumer (1948) emphatically 
pointed out more than a quarter of a century ago, 
however, such a stance provides only a semantic escape 
from a serious question. In order to use an instrument 
to examine an object, he pointed out, investigators must 
have an idea of what they are trying to get at in order 
to be able to understand the importance of what they 
see. Without an external framework for analysis, survey 
researchers cannot be sure that they study public 
opinion--only that they study public opinion polls.

Should public opinion be equated with public opinion 
poll results? A classical public opinion theorist would 
probably say "no," because much of what is reported in 
public opinion surveys is a projection of a respondent's 
inner psychic needs and has little to do with public 
affairs; is derived passively and uncritically from 
external sources and has not been subjected to serious 
debate or discussion; is a collection of unsupported 
views not necessarily in character with the individual's 
or society's world view and thus ephemeral in nature; and 
is entirely removed from consequential action, with 
people unlikely to translate their views into any action, 
particularly public action. Thus, Blumer (1948) argued,
surveys have not been designed to capture the processes 
by which issues are raised, debated, and resolved in 
public life. 

If public opinion analysts and reporters take their 
task seriously, they must begin to integrate, explicitly, 
other sources of information for a fuller portrait of 
public opinion. Such sources include information about 
communication processes, the centrality of an attitude in 
a psychological and cultural framework, and the 
implications of any particular view for individual or 
collective action. 

Even when a researcher's measurements are grounded in 
carefully defined theoretical constructs, however, 
another pitfall may exist if respondents do not share the 
researcher's conceptual framework. This has proved to be 
the case in attempts to understand people's attitudes
concerning risk. Most surveys of public attitudes
regarding policy issues assume that issues of risk, as
presented by policy makers, are well formulated. Although 
this assumption may be fairly reasonable when one is 
dealing with familiar, articulated issues, it becomes 
questionable when an issue, and the language in which it 
is expressed, is rapidly evolving. In those situations, 
policy makers may be framing issues in an incoherent, 
inconsistent, or even illogical fashion, or they may be 
using terms without clear or shared meanings. In such 
situations, surveys may suggest to policy makers not only 
how to resolve issues, but also how to formulate them, 
indicating what options to offer, what terms to use, and 
what interpretations to make. For this potential to be 
realized, however, a survey researcher has to permit 
respondents to have a hand in defining what the issue is 
all about and allow respondents' concepts to be expressed 
in their own terms. 

One way of exploring differences between a researcher's 
and respondents' conceptual frames of reference is an 
approach known as ethnographic semantics. Anthropologists 
and psycholinguists have developed procedures that enable 
an investigator to determine the verbal coding schemes in 
which people understand and interpret their reality. The 
procedures can be used to conduct more rigorous and 
systematic pretesting for surveys than is presently done; 
they can also be integrated into the survey proper. (One 
example of the application of procedures based on 
ethnographic semantics to survey design is provided by 
Mauser [1972], who used this technique in an attempt to
develop a detailed model of the structure of voter 
preferences in an election.) 

Procedures based on ethnographic semantics might shed 
light on the effects of question wording variations, such 
as the finding that survey respondents are more willing 
to "not allow" speeches against democracy than they are 
to "forbid" them (see Table 1, above, and Schuman and 
Presser, 1981). A careful analysis of the class of
concepts related to "forbid" and "allow" (such as 
"restrict," "ban," "tolerate," "approve," etc.) might 
reveal that "forbid" and "allow" do not exhaust the 
domain of options with respect to antidemocratic speeches.
For example, respondents who believe speeches against 
democracy should be restricted or hindered might 
reasonably believe that such speeches should neither be 
(completely) forbidden nor (completely) allowed. Such 
respondents in a survey might disagree with both versions
of the question in Table 1, and they might thus account 
for the difference in response distributions for the two 
questions. 
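
The arithmetic behind this conjecture can be checked against the Table 1 percentages. The sketch below assumes, purely for illustration, that every respondent belongs to one of three groups: those who would forbid the speeches outright, those who would allow them outright, and a middle group that answers "no" to both question forms. It also assumes that the two split-ballot samples are comparable. The function name is hypothetical.

```python
def implied_middle_share(pct_forbid_yes, pct_allow_yes):
    """Share answering "no" to both forms under the three-group assumption.

    Assumes nobody would answer "yes" to both "forbid" and "allow".
    """
    middle = 100 - pct_forbid_yes - pct_allow_yes
    if middle < 0:
        raise ValueError("percentages are inconsistent with the three-group assumption")
    return middle


# Table 1: 54% answer "yes" to forbid; 25% answer "yes" to allow.
forbid_yes, allow_yes = 54, 25
middle = implied_middle_share(forbid_yes, allow_yes)

# Under the assumption, the "no" percentages in Table 1 are reproduced:
not_forbid = allow_yes + middle   # "no" to the forbid form
not_allow = forbid_yes + middle   # "no" to the allow form
print(middle, not_forbid, not_allow)
```

Under these assumptions a middle group of about one respondent in five would reproduce both columns of Table 1. That shows the conjecture is arithmetically possible; it does not show that it is the correct explanation.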

Ethnographic semantics may provide survey researchers 
with a useful tool in framing questions and response 
alternatives, although these procedures have not been 
widely exploited for this purpose. 



The Role of the Respondent 

It has long been recognized that social aspects of the 
interview situation may influence responses to survey 
questions. A respondent may wish to please or favorably 
impress an interviewer, for example, and hence give 
socially desirable responses, or a respondent may agree 
with an interviewer's questions out of deference to her 
or him. Yet there has been relatively little systematic 
analysis of respondents' interpretations of the interview 
situation. Evidence suggests that the purpose and nature 
of a survey are not usually very clear to respondents, 
who often do not know exactly what is expected of them. 
(See, for example, evidence presented by Cannell et al.,
1977.) The lack of information leaves the respondent
free to interpret the interview situation and his or her 
role in it in a variety of ways, some appropriate and 
some inappropriate. Respondents' role expectations are 
likely to vary according to the social characteristics of 
the interviewer; the expectations communicated by the 
interviewer in her opening remarks; the respondent's 
preconceived notions about the likely meaning and 
appropriate response to a stranger's request for personal 
information; and the influence of other people who may be 
present during the interview. These situational factors 
vary from one interview to the next (indeed, most survey 
organizations make little attempt to control them), but
we have only begun to understand their effects on how 
respondents define the interview situation and respond to 
survey questions. 

To achieve better control over situational sources of 
variability in responses to survey questions, there needs 
to be fuller investigation of respondents' understandings 
of their roles and the influence of different role 
expectations on the nature and quality of responses. 
Such investigations might be guided by several approaches 
and principles. First, situational variables should be 
more carefully monitored or manipulated so that their
effects can be assessed. Second, since situational and 
procedural variables influence responses because they 
have meaning to the respondent and the interviewer, we 
should not expect their effects always to be simple, 
constant, or general. A respondent's interpretation of 
the meaning of, say, an offer of monetary payment is 
contextually dependent. Perhaps for this reason, 
monetary incentives (and other survey procedures) often 
have inconsistent effects on the quality of responses. 
(One way of assessing how respondents define and react to 
the interview situation is through postinterview 
debriefing.) A thorough understanding of the properties 
of the survey interview as a measurement device cannot be 
achieved without considering the cultural context, for 
example, the norms and expectations that govern 
interactions between two strangers. One should not 
assume that opinions expressed during an interview would 
be similarly expressed by the respondent in other roles 
or other social settings. In order to understand the 
meaning and significance of opinions expressed to 
interviewers, we thus need to know how the respondent 
role is distinct from, and how it is similar to, other 
social roles in which people express or act on their 
attitudes and opinions. 



The Question-and-Answer Process 

Although question wording and order sometimes have 
substantial effects on response distributions, survey 
researchers often cannot predict when such effects will 
occur or explain them when they do occur. This suggests 
that the process by which respondents interpret survey 
questions and formulate answers to them is not yet well 
understood. 

It is probable that many artifacts represent systematic 
psychological phenomena that are not restricted to 
surveys. Therefore, the panel reviewed relevant 
psychological literature in order to formulate hypotheses 
about psychological sources of bias in the question- 
and-answer process. 

At the most elementary level, it is apparent that 
respondents cannot give meaningful answers if they do not 
understand a question. The interviewer's job is to 
communicate the meaning and intent of each question to a 
respondent and to elicit and record the respondent's 
answer to each one. In doing so, the interviewer is
governed by a set of rules about how questions are to be 
asked and answered and the sort of guidance she may give 
to help the respondent formulate an answer. From this 
perspective, errors result when the interviewer fails to 
communicate, or the respondent fails to understand, the 
intent of the question. The failure may occur because 
the interviewer performs her job poorly or because the 
rules that she follows are inadequate for eliciting 
complete and relevant answers. 

A second, more subtle source of bias is the influence
of a question on a respondent's subjective state. When 
we ask questions in a survey or in the course of normal 
conversation, we often assume implicitly that the process 
of questioning does not itself influence the subjective 
event (such as a memory, belief, attitude, etc.) that we 
are attempting to measure. There is, however, reason to 
question this assumption, for a question not only shapes
the overt answer that is given but also affects the
subjective state on which the answer is based. A
question is itself a source of information that sometimes
alters the way a respondent classifies or labels the
event in question, and a respondent's answer represents a
verbal commitment that--despite its seeming lack of
important consequences--can strengthen the respondent's
belief or intention. Response variability can occur if 
the process of answering a question changes the 
subjective state to which the question refers. 

A third source of bias is contextual influences within 
sets of survey items. In a survey, prior items or item 
sets may influence responses to a subsequent item. This 
problem is usually phrased in terms of "consistency 
pressure." However, we must distinguish between 
constructed consistency, developed on the spot under 
pressure of the interview situation, and revealed 
consistency, reflective of the coherence of underlying 
prior attitudes. The former is artifactual; the latter
is meaningful. An example of constructed consistency is
provided by a split-ballot experiment conducted by Hyman
and Sheatsley (1950) and replicated by Schuman and
Presser (1981). Respondents who first endorse the right
of American reporters to have access to news in Russia
are more likely to accord Russian reporters the same
right in America than are respondents who are asked the
"Russian reporter" question first. Review of the psychological
literature suggests that constructed consistency is not 
likely to be a typical occurrence in interviews. The 
risk is greatest when the target item is ambiguous, the
questionnaire context suggests a very strong irrelevant 
response schema, and there is time for the respondent to 
think over the implications of the context item for the 
target item. Other factors dispose toward "contrast 
effects." One type of contrast effect is due to the 
influence of the range of stimuli available for 
comparison. For example, an assault is judged to be less 
serious when it is judged in the context of a homicide 
than when it is judged after another assault (Pepitone 
and DiNubile, 1976). Particularly vulnerable to contrast 
effects are item sets with interrelated parts, or sets in 
which a query about a part of an issue is followed by a 
general question about the whole issue. 

Finally, one may approach survey questions as 
cognitive problems that the respondent is asked to 
solve. Although the questions asked in surveys present 
respondents with a very diverse and complex set of 
intellectual tasks--inference, recall, evaluation,
introspection, etc.--relatively little is known about the
structure of the problems posed by survey questions or
how respondents go about solving them. Psychological
evidence suggests that respondents do not always answer 
questions as asked; they may instead transform a complex 
question into a simpler or more meaningful one, or they 
may employ various heuristics to arrive at an answer. 
Consequently, the answers to survey questions are 
frequently not as readily interpretable as one would like 
to assume. From this perspective, then, response errors 
result from the respondent's failure to carry out the 
response tasks, and the errors that characterize the 
responses are likely to vary systematically according to 
the nature of the intellectual tasks presented by the 
question (see Chapter 9).

Our attempt to analyze the psychological and social 
dynamics of the interview process remains somewhat 
speculative. Our review suggests that survey biases 
represent systematic phenomena that, in various guises, 
have been confronted by scientists using methods other 
than surveys. Therefore, we believe that the tools and 
perspectives of various social science disciplines will 
prove fruitful when applied to survey measurement. 



REPRISE 

The panel's review indicates that the survey enterprise 
is quite substantial in size, and it appears that the 
data it produces can sometimes have notable (and not 
well-understood) effects on the public. The collection 
and reporting of survey measurements of subjective 
phenomena, however, are not of uniformly high quality,
and there exist numerous causes for concern both about 
common practices and also about the foundations of basic 
knowledge on which contemporary practices rest. 

In considering our review of the contemporary state of 
the survey enterprise, one should keep in mind that 
sample surveys of subjective phenomena are a relatively 
recent invention. They came into common use less than 
half a century ago. There is, thus, reason to hope for 
future improvement, and the panel believes that 
constructive criticism of deficient practices should be 
both encouraged and welcomed. Future improvements will 
have to be built upon careful considerations of negative 
(as well as positive) examples. It should also be borne 
in mind that survey measurements of subjective phenomena 
are not unique in confronting problems of error, 
unreliability, and conceptual ambiguity. Analogous 
problems beset factual surveys and physical measurements. 

It was in the belief that the future must and will 
produce improvements in the collection, analysis, and 
reporting of survey measurements of subjective phenomena 
that this panel came to consider ways in which such 
improvements might be encouraged. 



Recommendations 



We come now to the portion of our report in which we must 
translate our work into advice. We cannot offer a series 
of dramatic and novel recommendations; too many capable 
critics and practitioners have been over the ground before 
us for this to be a realistic aspiration. Moreover, 
almost every recommendation must carry a tacit disclaimer: 
this advice has not been adequately tested. But we can 
offer a series of recommendations that could 
significantly improve the survey enterprise. 18

Our recommendations are intended to advance three 
goals:

   To create an improved climate for the conduct of
   polls and surveys;

   To upgrade current survey practice; and

   To advance the state of the art of survey
   measurement and the scientific use of survey data.

Most of the recommendations we offer pertain to more than 
one of these goals, but we use this list as a device for 
organizing the subsequent presentation. We begin, 
however, with our single most important recommendation, 
which applies equally to all three goals. 

RECOMMENDATION 1 Take surveys and polls seriously. 

Survey researchers should take surveys seriously in 
much the same way that physicists take particle 
accelerators seriously, astronomers take telescopes 
seriously, and space scientists take space vehicles 
seriously. There are, of course, recreational aspects to 
these tools. Using them for entertainment to a limited 
degree is no doubt benign. Anyone who wants to can 
"poll" any others who are willing, and little harm may be 
done. But, just as one would not endorse the idea that 
just anyone can play doctor, administering any and every 
drug, one should not endorse the conceit that anyone can 
do or interpret a poll or that anybody's poll is 
necessarily as good as anyone else's. And we protest in 
the strongest terms when commercial, political, and 
social action groups carry out bogus surveys as a way of 
ingratiating themselves with their prospective customers or 
converts. 

How would life be different if this recommendation 
were generally accepted? There would be fewer polls and 
surveys that are manifestly inadequate to their stated 
purpose, ones carried out merely for the sake of a ritual 
or to postpone the making of a decision. One would not 
read extravagant or frivolous interpretations of poll 
results, presented so that a critical reader has no 
opportunity to make a contrary interpretation. Polls and 
surveys would be applied to serious social purposes and 
not used solely for the sake of public relations or the 
legitimation of decisions already taken. Moreover, the 
public would be supportive of competent and careful 
exercise of the art of surveying or poll-taking and 
capable of discriminating between standard and substandard 
performance. And, finally, resources would be husbanded 
so that fewer surveys and polls could provide more and 
better information. When polls are taken seriously, the 
public will no longer believe on the authority of a 
cynical pollster that you can come up with any result you 
want. 

We have not addressed the issue of how to implement 
the recommendation to take surveys seriously. But the 
reader should not jump to the conclusion that regulation, 
coercion, or other forms of ordering and forbidding are 
the only way. We intend, more deviously, to change 
people's minds and habits by insinuating better practices 
as models for the conduct, reporting, and use of surveys. 
Our subsequent recommendations propose measures that 
amount to taking surveys more seriously and count on 
competition to eliminate, in the long run, the inferior 
practices. 

IMPROVING PUBLIC UNDERSTANDING OF SURVEYS AND POLLS 

In most of our recommendations we are seeking better 
surveys and polls that will be more effectively used and 
therefore enjoy greater public confidence. In this 
section, however, we are concerned with the climate 
within which these improvements must come about. There 
are, of course, many interested parties: the various 
sectors of the survey industry itself (government, 
commercial, and academic); the many kinds of users 
(journalistic, political, governmental, scholarly, and 
others); and the diverse groups addressed by the users 
and the industry. We neither anticipate nor desire an 
undifferentiated approval of or participation in all 
surveys. On the contrary, there must be vigorous debate, 
constructive criticism, and responsive actions concerning 
issues of purpose, content, and method in surveys. What 
we would hope for, however, is an elevation of the plane 
of discussion. Fortunately, we can be quite concrete 
about what we are seeking, given the plentiful supply of 
published commentary on surveys and polls. 

We note first a good example of an attempt to educate 
the public, an unsigned article entitled "Opinion polls 
are [ ] accurate, [ ] slanted, [ ] don't know," in Changing 
Times magazine (March 1980). The article's lead reads: 
"As the political polling season swings into high gear, 
keep this rule in mind: Don't believe everything you 
read. Be especially suspicious of polls by groups with a 
definite point of view." The message is that the reader 
must avoid "uncritical acceptance of a single study." 
The article identifies sources of bias as the survey 
sample, the sequence of questions, and the questions 
themselves (as, for example, leading questions). The 
tenor of the article is suggested by this excerpt: 

To make a decision on the merits of a poll, 
you need certain basic information: who 
sponsored the survey; when and how the 
interviews were conducted: by telephone, by 
mail or in person (face-to-face polls are the 
most reliable, mail polls the least so); the 
exact wording of the questions; what part of 
the population was surveyed (registered voters 
or adults over 18, for instance); how big the 
sample and the subgroup were; which results 
were based on a subgroup (for example, all 
those aware of an issue, versus all those 
surveyed) and what the sampling error was. 
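
The items listed in the excerpt can be treated as a simple 
checklist. The following is only an illustrative sketch: 
the class name, field names, and function are hypothetical 
and are not taken from the article or from any professional 
code; they merely restate the excerpt's list in a form that 
can be checked mechanically. 

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PollReport:
    """Disclosure items named in the excerpt (field names are hypothetical)."""
    sponsor: Optional[str] = None            # who sponsored the survey
    interview_dates: Optional[str] = None    # when the interviews were conducted
    interview_mode: Optional[str] = None     # telephone, mail, or in person
    question_wording: Optional[str] = None   # exact wording of the questions
    population: Optional[str] = None         # e.g. registered voters, adults over 18
    sample_size: Optional[int] = None        # size of the full sample
    subgroup_size: Optional[int] = None      # size of any subgroup reported on
    subgroup_results: Optional[str] = None   # which results rest on a subgroup
    sampling_error: Optional[float] = None   # stated sampling error


def missing_items(report: PollReport) -> List[str]:
    """Return the names of the disclosure items the report leaves out."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]


if __name__ == "__main__":
    # A report that names only its sponsor and sample size.
    report = PollReport(sponsor="Example Lobby", sample_size=1200)
    print(missing_items(report))
```

A report that leaves several of these items blank gives the 
reader little basis for judging the poll, which is the 
excerpt's point. 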

We note also other efforts in this general direction 
(Gallup, 1972; Yankelovich, 1980; Cantril, 1980) 
including an important attempt to provide guidance to 
journalists in evaluating polls (Wilhoit and Weaver, 
1979). While some of these could no doubt be improved, 
they are examples of the kind of analysis that we hope 
will find its way into general circulation magazines and 
newspapers. 

Another generally good example of journalism is a 
series of articles by E.J. Dionne, Jr., in The New York 
Times in 1980. Somewhat more technical and detailed than 
the item just quoted, these articles cover a wide range 
of topics including the effects of question wording on 
the results of abortion polls, the effects of poll results 
on political campaigns, the growth of the industry, and 
the (limited) use of polls in the Soviet Union. 19 In 
contrast to these constructive appraisals, which are by 
no means uncritical, we take strong exception to the 
irresponsible approach taken in some other widely 
circulated articles. 20 While making some sound 
points, such as the fallibility of election forecasts, 
the problem of nonresponse, and the over-reliance of 
political candidates and office holders on poll 
results, these articles take an extreme position in 
advocating that the public refuse cooperation and provide 
"phony replies" to political polls. Refusal to 
participate in a poll or survey is, of course, a right 
that people can and do regularly exercise. However, 
indiscriminate and malevolent supplying of "phony replies" 
would damage all types of research regardless of its 
merit and might ultimately compromise important national 
measurement programs, such as the monthly surveys that 
produce national statistics on employment and 
unemployment. 

We hope that in the future irresponsible criticism 
would be supplanted by frank and more frequent public 
discussions by survey practitioners of the problems and 
progress of their profession. Our second recommendation, 
therefore, is directed to survey practitioners. 

RECOMMENDATION 2 Encourage more competent and informed 
public discussion of the faults and merits of polls and 
surveys. 

Survey practitioners have a responsibility to educate 
the public and to disclose fully their methods and 
findings, so that informed criticism and debate are 
possible. Indeed, it is in the interest of reputable 
survey practitioners to encourage such debate in order to 
protect themselves from irresponsible or false allegations 
and to prevent the survey enterprise from being undermined 
by inferior or blatantly dishonest uses of surveys and 
polls (see Chapter 3). The recent example of the dubious 
use of a phone-in "poll" conducted by ABC News after the 
1980 presidential debate, a "poll" that was criticized by 
ABC's own pollster (Sanoff, 1981), is unfortunately not 
an isolated case. We recommend that the professionals 
who conduct surveys and work with survey data contribute 
a significant portion of their professional effort toward 
the enlightenment of the public. 

This enlightenment would be advanced by the increased 
discussion in public by survey professionals of the 
faults and merits of particular polls and surveys. 
Critical symposia could have a salutary effect if the 
professional debates in this area were more frequently 
made public. As we noted above, journalists also have an 
important role to play as critics. The public should be 
educated on the importance of asking a range of questions 
on an issue and not relying on any single item as a 
measure of public opinion. 

To further progress toward public enlightenment, our 
third recommendation is directed to the relevant 
professional organizations, in particular the National 
Council of Public Polls, the Council of American Survey 
Research Organizations, the American Association for 
Public Opinion Research, and the survey methods section 
of the American Statistical Association. 

RECOMMENDATION 3 Secure the agreement of the various 
agencies that publish survey findings, and especially of 
the press and scholarly journals, to adhere to 
appropriate standards of reporting and disclosure. 

Reports of surveys and polls in the media often do not 
provide adequate information to allow informed evaluation 
by readers. Our exploratory review (see Chapter 3) 
indicated that even such elementary information as who 
sponsored or funded the survey and the wording of the 
survey questions is often lacking. Another reviewer of 
poll reports in The New York Times concluded (Paletz et 
al., 1980:505): 21 

with the . . . exception of polls conducted by 
(its own survey) unit, the Times stories 
inadequately detailed the technical 
complexities or the possible deficiencies of 
most of the polls printed. ... it is our 
impression that the way methodological 
information about polling is reported in the 
media tends more to reassure than alert the 
audience about the possible defects of polling 
data. 

Professional codes of practice require that pollsters 
include in their reports a range of information 
(including the sponsorship of the survey, dates of 
interview, sample sizes, complete question wordings, and 
so forth), but the news media do not in turn report such 
details to their readers. Yet such information is 
essential for an informed evaluation of poll reports. 22 
We note that several British newspapers have entered into 
informal agreements setting minimal standards for the 
reporting of election polls. However, our analysis of 
poll reports in Britain indicates that these agreements 
do not appear to have had an effect on the reporting of 
other (i.e., nonelection) polls; the British poll reports 
were woefully incomplete in the majority of cases we 
examined. 

Consequently, our general recommendation has two 
specific implications: (1) survey researchers in the 
United States should make collective efforts to secure 
the agreement of the American media to disclose 
appropriate details of published polls; and (2) extant 
agreements in Britain (and any other nations having them) 
to disclose polling methods should be applied to all 
polls and should be carried out. We note that such 
agreements presuppose that survey methods and findings 
are fully documented. This is a presumption that applies 
to all scientific endeavour and certainly ought to be met 
in any poll or survey that wants to make a proper claim 
to public acceptance. 

Both survey researchers and journalists must bear 
responsibility for establishing a climate of opinion that 
makes such details newsworthy. It presumes little 
technical sophistication to point out that a newsworthy 
element of a report that, for example, "X% of the public 
supports expansion of the nuclear energy industry" is 
knowing whether the poll was commissioned by a pro-nuclear 
or anti-nuclear lobbying group. Yet such obviously 
relevant facts are not always reported to the public. 
Just as a competent reporter would regard the conflict of 
interest of a public official as worthy of note, so too, 
poll reports should provide sufficient information to 
allow an evaluation of the integrity of polls. 

In tandem with the foregoing we address our fourth 
recommendation to the public. 

RECOMMENDATION 4 Discriminate among surveys according to 
the quality of their research procedures. 

Users of surveys (those who sponsor, use, publish, or 
simply read about them in newspapers) should be far more 
careful than they have been about the sorts of practice 
they will accept; they can become more selective 
consumers. While precise knowledge at the frontiers of 
methodological research is sometimes uncertain, there is 
a set of elementary norms about how surveys should be 
conducted. For example, sampling should be designed to 
guard against unplanned selectiveness, question wording 
should be carefully examined for special sensitivity or 
bias, interviewers should be thoroughly trained, 
questionnaires ought to be pretested, data analysis 
should be competent and clear, and survey methods and 
data should be documented and made available for 
independent examination. 

Given some fairly well-accepted practices and 
professional review (see Recommendations 5 and 6, below), 
users can evaluate, or have others evaluate, the quality 
of the procedures used to conduct a survey and thus the 
dependability of its reported findings. Consumers should 
use such information and expertise, when available, to 
improve their critical judgments. Consumers should be 
skeptical of any survey organization not willing to make 
its procedures and relevant data available for independent 
review. Indeed, the potential consumer of the findings 
from such undocumented surveys ought to consider them 
suspect. Only by exercising their full discriminatory 
powers and taking advantage of available professional 
resources can survey consumers be assured that survey 
organizations will take all of the steps necessary to 
maintain practice at the highest level. 



UPGRADING CURRENT SURVEY PRACTICE 

There are two distinct desiderata for improving survey 
practice. One is to elevate the state of the art, so 
that there can be better surveys in the future than are 
possible today. The other is to upgrade current 
practices, moving their quality closer to the limits 
imposed by the current state of the art. In fact, the 
present state of the art is a great deal more advanced 
than one would imagine from reading the widely published 
criticisms of surveys of recent years. Authors of many 
of these criticisms may have missed one of their stronger 
rhetorical points in failing to hold up the better surveys 
as exemplars against which to compare the poorer ones. 

No doubt some criteria of excellence are in part 
incompatible with other criteria, so that laying out 
specific regulations would be a thankless if not hopeless 
task. 23 But that is only to say that no set of 
prescriptions can usurp the role of considered judgment 
in balancing alternative means and ends. We believe that 
a knowledgeable working party of six or eight people could 
produce a document specifically oriented to surveys of 
subjective phenomena comparable to the booklet, "Standards 
for Discussion and Presentation of Errors in Survey and 
Census Data" (Gonzalez et al., 1975), based on practice 
in the Bureau of the Census and other government agencies 
engaged in fact-finding surveys. Such a document need 
not have the force of law to be helpful; the cited 
document has only the force of example in regard to 
nonfederal survey organizations. 

Preparation of such a document would be a logical 
sequel to the work of our panel. The reader could ask 
why we did not include that task on our agenda. The 
answer would have to be that we had a much broader 
mandate that left neither time nor resources to do the 
painstaking detailed work contemplated in this proposal. 
But standards are implicit in the suggestions and 
conclusions of all our chapters, and the literature is 
not devoid of supporting technical material. 

In the next several recommendations, we focus on what 
may be done to encourage practitioners to upgrade the 
quality of individual surveys and then consider improving 
the state of the art. In order that the public may more 
effectively evaluate the quality of surveys, we offer two 
parallel recommendations to survey researchers. 

RECOMMENDATION 5 Improve the review and evaluation 
process for poll reports. 

Survey data are presented in various forms to various 
publics and the review process is thereby fragmented. 
Many data are collected for private use only (e.g., 
in-house marketing research and political polls by the 
candidates). Since those data are not directly presented 
to the public, there is no external review process, 
although everyone feels the impact of the marketing and 
political decisions that are partly shaped by those data. 

Of the data that are publicly released, a large body 
is collected by or reported in the mass media, including 
the regular privately conducted polls by Gallup, Harris, 
and others. Although it appears (see Chapter 3) that 
public review is spotty and usually lacking in depth, one 
does, nonetheless, encounter some examples in which poll 
results are disputed, reanalyzed, and reinterpreted. 24 

A second major vehicle for the dissemination of survey 
data is scholarly publication. Data collected for 
academic purposes (e.g., the national election studies of 
the Institute for Social Research [ISR] and the General 
Social Survey of the National Opinion Research Center 
[NORC]) are evaluated and reviewed by the agencies that 
funded the data collection. There are various scholarly 
guidelines on preferred survey practices and methods, but 
no authoritative set of standards for design, collection, 
or analysis of survey data. Of course, journal articles 
undergo editorial and peer review before acceptance and 
are subject to general academic scrutiny after 
publication. There are instances of vigorous exchanges 
such as that between Nie and Rabjohn (1979), Bishop et 
al. (1979), and Sullivan et al. (1979) over the impact of 
question wording on conclusions about the belief systems 
of the public (Converse, 1964), and the recent debate 
between Caddell (1979) and Miller (1979) over the 
American "crisis of confidence." Social scientists, 
however, have generally been insufficiently attentive to 
survey methods. Because of this, even scholarly survey 
research often does not get as thorough a review as it 
should (see Presser, 1980). 

Finally, we note that surveys conducted by the federal 
government are usually subject to rigorous internal 
review and evaluation. The Census Bureau in particular 
carries on a significant program of methodological 
research designed to improve its measurement procedures. 
And when published, government survey reports are often 
subject to scholarly evaluation and criticism. We note, 
nonetheless, that a recent review (Bailar and Lanphier, 
1978) of a small sample of federally sponsored surveys 
suggested that research practice in this domain is still 
far from adequate and that considerable improvements 
should be sought. 

In addition to these usual procedures, there are ad 
hoc evaluations. Occasionally an interested party is so 
agitated by a survey (or its results) that a special 
evaluation is launched. Mayor Maier of Milwaukee, for 
example, had three groups of outside consultants examine 
a pair of polls conducted by the Milwaukee Journal 
(Gimbel, 1979); Serge Lang launched an extensive letter 
writing campaign against the 1977 survey of American 
professors (Lang, 1981); and the Minneapolis Tribune, 
unhappy with the results of its own Minnesota poll, had 
an evaluation conducted by associates of the Institute 
for Social Research (Campbell et al., 1979). Each of 
these is an example of a valuable ad hoc evaluation and 
review. 

There exists, in sum, a scattered system for evaluating 
survey data. Often this system works quite well. Careful 
reviews are carried out and useful criticisms are made. 
But the patchwork nature of this system also implies that 
it often fails completely or works only belatedly. And, 
unfortunately, the book review columns even of scientific 
journals seem hopelessly wedded to literary criteria. 25 
Few surveys per se are reviewed, and there are practically 
no reviews of polls or surveys that are not in published 
books. Thus, a stream of survey reports issues from 
government agencies and such major data producers as the 
Survey Research Center without undergoing formal and 
critical external review. 

We suggest that the survey research methods section of 
the American Statistical Association organize a forum for 
the publication of reviews of polls and surveys. There 
are interesting precedents. The American Statistician 
now reviews computer programs and teaching materials, and 
Contemporary Sociology reviewed the Cumulative Codebook 
of the General Social Survey. The Public Opinion 
Quarterly and Public Opinion magazine routinely present 
compilations of survey and poll results. These 
precedents might be expanded to include critical reviews 
of surveys and polls. Of course no single journal can 
review all survey reports. But even a highly selective 
coverage of current output would be most valuable. 
Review articles, as distinct from mere compilations of 
survey results, would also be very useful. 

To assist in upgrading the quality of current surveys, 
the following recommendation is addressed to the 
producers of survey data. 

RECOMMENDATION 6 Make independent peer review a part of 
the design, analysis, and reporting of surveys. If 
voluntary reviews are not available, then a portion of 
the resources of the survey should be allocated to obtain 
paid reviews. 

Peer review is a feature of all science. However, the 
applied nature of survey research and the institutional 
structure of the industry may make competent reviews 
difficult to obtain. Thus, many surveys are not reviewed 
at all within the scientific community. Such public 
commentary as is generated often comes in political 
arenas, in which the participants may be less concerned 
with the quality of measurement than with the political 
implication of a survey. 

One alternative is to have independent review of a 
survey be a standard budgeted expense. Review should 
come both early in the process, so that the design of the 
survey can be improved, and late, so that the most honest 
reporting possible will occur. A survey that followed 
such a procedure might receive some special designation 
(as do those that abide by standards of disclosure of the 
National Council of Public Polls). This might encourage 
those who sponsor surveys to undertake the additional 
expense. 

Although there are obvious problems with ensuring the 
independence of reviewers who are on the payroll, some 
reasonably sound arrangement might be made with the help 
of existing institutions or the independent center whose 
establishment we recommend below (Recommendation 13). In 
any case, we believe that more vigorous institutionalized 
self-criticism within the scientific and survey research 
community can be an effective way to regulate the quality 
of survey research. 

Election polls are in particular need of continuing 
review, because of their potential significance to 
candidates, voters, and the survey research community, 
and we therefore address our next recommendation to the 
relevant sectors of the professional survey research 
community. 

RECOMMENDATION 7 Establish a panel or committee to 
evaluate the performance and methodology of election 
polls. 

The history of election polling reveals periodic 
public criticisms of the accuracy of election forecasts 
(see Chapters 1 and 2). This dates back at least to the 
Literary Digest fiasco of 1936. In 1948, a public outcry 
followed the erroneous "election" of President Dewey by 
the polls (the Gallup and Crossley polls predicted Truman 
would lose with 44 percent of the popular vote, but 
Truman won with almost 50 percent of the popular vote). 
This led to a specially commissioned study by the Social 
Science Research Council (Mosteller et al., 1949a, 1949b). 
This example has not, however, been repeated even though 
the raw errors in predictions have exceeded in some cases 
the discrepancies between the polls and results in the 
1948 presidential race. For example, in the 1978 
elections, several governors and members of Congress 
"elected" in the polls went down to defeat in the 
election. Similarly, in the 1980 presidential election, 
Ronald Reagan's substantial margin of victory over 
President Carter (51-41 percent) was not anticipated by 
most of the public polls (Mitofsky, 1981). 

We were not charged with reviewing election forecasts, 
and we have not, therefore, devoted detailed attention to 
this topic. We believe, however, that a regular review 
of the accuracy of such forecasts could be of use both to 
the survey industry and to the public. We note in this 
regard the example set in Britain, where election 
post-mortems have been held under the auspices of the Royal 
Statistical Society. Such evaluative activities can 
upgrade survey practice and public understanding of 
survey results. A committee appointed to evaluate 
American election forecasts might well function under the 
institutional auspices of the center whose establishment 
we propose below (Recommendation 13). 

Allocating Resources. Many things that could result 
in the improvement of surveys are technically feasible 
and economically in reach without major new expenditures 
for survey research. Of course, such expenditures might 
be justified on other grounds, primarily the need for 
more information or information of a different kind than 
is now available. We have not tried to assess such a 
need, but we do strongly feel that increments to the 
quantity of survey research must not be purchased with 
decrements to its quality. Improvements in the quality 
of surveys do not inevitably require increases in total 
costs. In this regard, we address our next 
recommendation to survey sponsors and survey 
organizations. 

RECOMMENDATION 8 Make a better allocation of resources 
for surveys. 

It is quite possible there are too many surveys. 
Certainly there are too many surveys of mediocre and 
inferior quality that do not accomplish anything useful. 
There are too many that report on matters of only the 
most transitory or parochial interest. There may be too 
many in the light of their burden on respondents, but we 
do not know. If any of these situations exist, we join 
others in deploring the state of affairs, without having 
any general remedy to offer other than relentless 
publicity for substandard and superior practice in the 
professional and public media. Respondents must regulate 
their own burden by choosing not to respond to some or 
all surveys. Evidence suggests that respondents' refusal 
to participate in a given survey is associated with their 
(stated) perception of the value of surveys in general 
rather than their perception of the value of the 
particular survey in which they are asked to participate 
(Frankel and Sharp, 1981:110). It is thus to the 
advantage of the survey profession to decrease the 
frequency both of substandard polls and of sales and 
direct mail solicitations that masquerade as surveys. 

More specifically, our recommendation implies that 
each survey organization should re-examine the ways it 
spends its money. In addition, each agency that 
contracts for or funds surveys can adopt policies to 
improve the pattern of resource allocation. In particular 
we urge survey organizations to consider the following 
propositions: 

(1) There is too little methodological work of the 
right kind; 

(2) There is too little investment in pilot studies 
and auxiliary investigations. 

Or, to state the point in a different way, there are now 
many small-scale studies with a few hundred respondents 
that are manifestly inadequate for making serious 
estimates of important quantities and relationships but 
that are splendid as pilot studies. The problem is that 
they are not treated as such but are regarded as completed 
investigations. Apart from candidate preference polls 
and single-question referenda, serious surveys are seldom 
interested in just estimating a single proportion. The 
interest is in comparisons of population strata and in 
multivariate relationships. The samples used for these 
purposes are often much too small: appropriate enough for 
pilot studies but not for the supposed substantive 
purposes of the research. 26 In academic research, one 
of the few instances in which this has been acknowledged 
is NORC's General Social Survey. Its annual sample size 
is only 1,500, but most of its questions are repeated 
year after year; hence, the sample size accumulates 
gradually in such a way that serious analyses ultimately 
become feasible. 
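
The arithmetic behind this point can be sketched roughly as 
follows. The sketch is illustrative only and rests on 
assumptions that are not the panel's: simple random 
sampling, a worst-case proportion of 0.5, a 95 percent 
confidence level (z = 1.96), a hypothetical subgroup making 
up 10 percent of the sample, and questions repeated 
unchanged with no real trend across the pooled years. 
Actual surveys use clustered designs, for which the errors 
are larger. 

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion (simple random sampling)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


def pooled_subgroup_margins(annual_n, subgroup_share, years):
    """Margin of error for a subgroup as equal annual samples are pooled.

    Returns a dict mapping the number of pooled years to the margin.
    """
    margins = {}
    for k in range(1, years + 1):
        subgroup_n = annual_n * k * subgroup_share  # expected subgroup cases
        margins[k] = margin_of_error(subgroup_n)
    return margins


if __name__ == "__main__":
    # Annual sample of 1,500; hypothetical subgroup of 10 percent.
    for k, moe in pooled_subgroup_margins(1500, 0.10, 5).items():
        print(f"{k} year(s) pooled: +/- {100 * moe:.1f} points")
```

Under these assumptions, an estimate for the 10 percent 
subgroup from a single year's sample of 1,500 carries a 
margin of roughly 8 percentage points, which falls to about 
3.6 points once five years are pooled. 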

In addition to too little methodological research and 
too little investment in pilot studies, we also assert 
that: 

(3) there is far too little analysis of the typical 
survey, at least from the point of view of basic 
research, whatever may be said concerning applied uses. 

It is well known to investigators with a major commitment 
to survey research that most of the data collected are 
never examined in a searching way. The ratio of analysis 
to collection is too low, and this is particularly 
distressing because analysis is relatively inexpensive 
compared with the costs of data collection. One obvious 
cause of this misallocation within an organization has to 
do with the organizational necessity of continually 
undertaking new surveys in the interest of maintaining 
cash flow. How this can be overcome is more than we can 
say, but it is a matter requiring most serious attention. 

From a social point of view, too many resources and 
too much of the public's time are spent collecting survey 
responses and too little is spent in finding out what 
the responses mean. This panel does not agree with one 
commentator's observation on the preliminary report of a 
survey that: "Sociologists will be spending the next year 
or so determining what these and other apparent trends 
really mean. But the raw data give the layman a pretty 
good idea right now." 

Our whole report speaks to the point that it is a lot 
more difficult than it seems to turn raw data into 
warranted conclusions. The task, however, is worthy of 
our best efforts, and hence we make the following 
recommendation to all survey organizations. 

RECOMMENDATION 9 Build into virtually all ongoing 
programs of survey data collection and research 
assessments of errors and collection of data relevant to 
methodological issues. 

We distinguish carefully between this recommendation 
and the usual one that calls for more methodological 
research. A report by Storer of a conference on survey 
applications of social psychological questions, for 
example, noted "the pervasive tendency among social 
scientists to specialize in substantive areas, which 
tends to lower their enthusiasm for methodological 
research" and went so far as to suggest (Storer 1969:6): 

[it might be] efficient to divide 
organizational responsibility for . . . 
various types of social research, either 
through upgrading private, nonprofit 
organizations by providing stable financial 
support, or through the establishment of 
another separate survey research agency in the 
Federal government. In the short run . . . 
the essential need is for increased support of 
methodological research through whatever 
organizations are willing and able to conduct 
it. 

The suggested division of responsibility is precisely 
what we think is not "efficient." It is, on the 
contrary, a recipe for ensuring that the results of 
methodological research are never used and, in 
particular, are not used to enhance the meaning and 
usefulness of actual surveys. An organization interested 
in cutting corners can too easily assent to the solution 
of letting someone else work out the methods and then 
find, with a show of regret, that the methods they 
developed would never work. 

The separation of method from substance in the social 
sciences is intellectually stultifying as well as 
economically costly. It has resulted in a self-sustaining 
"methodological" literature read only by methodologists 
discussing methods that have never been shown to work 
because their sponsors have never been responsible for 
solving actual substantive problems. Successful 
methodological development has almost always involved 
intimate interaction between methods experts and 
substantive investigators (sometimes an interaction going 
on within a single cranium). 

Of course, some methodological questions will require 
intensive and expensive investigations that might well be 
carried out separately, perhaps under the auspices of the 
center recommended below. However, we protest the idea 
that methodological work is in general a costly overhead 
or luxury for the survey organization. If anyone today 
were to suggest that calculation of sampling errors is 
not a part of a respectable survey but should be left to 
separate "methodological studies," he or she would not 
receive a sympathetic hearing. Among professional survey 
researchers, it is taken for granted that a survey is not 
complete until estimates of sampling variation have been 
computed. Sooner or later survey researchers and survey 
users must accept the equally compelling argument that 
estimates of nonsampling sources of variation are 
indispensable for a competent survey, and then the 
bifurcation of substance and method will be seen as 
equally absurd. 

To be concrete: If there is an issue as to the 
wording of a question, with several plausible 
alternatives, the obvious way to resolve the issue is 
with experiments trying the alternatives. The outcome 
can be either that the wording does not matter (i.e., 
essentially the same results are obtained with several 
different wordings) or that the wording does matter. If 
the former, the inclusion of both wordings in the main 
survey (not in some separate methodological experiment) 
strengthens the conclusions. But if the wording does 
matter, then surely a reliable estimate of the wording 
effect is required, not merely for some "methodological" 
purpose but to make a complete report on the substance of 
the investigation. 
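
As a minimal sketch of what "a reliable estimate of the wording effect" could look like, suppose two wordings are randomly assigned to halves of the main sample (a split-ballot design). The function name, the counts, and the assumption of simple random assignment without design effects are illustrative and are not taken from the panel's report.

```python
import math

def wording_effect(yes_a, n_a, yes_b, n_b):
    """Estimate a wording effect as a difference in proportions.

    yes_a / n_a: respondents giving the target answer / total asked wording A
    yes_b / n_b: the same counts for wording B
    Returns (difference, standard error, z statistic).
    Assumes random assignment of wordings and ignores design effects.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    diff = p_a - p_b
    # Unpooled standard error of the difference of two independent proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = diff / se if se > 0 else float("nan")
    return diff, se, z

# Hypothetical counts, invented for illustration only
diff, se, z = wording_effect(yes_a=330, n_a=600, yes_b=270, n_b=600)
print(f"effect = {diff:.3f}, se = {se:.3f}, z = {z:.2f}")
```

If the estimated effect is small relative to its standard error, the two half-samples can reasonably be reported together; if it is large, the effect and its standard error belong in the substantive report alongside the results.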

Methodological experts should be asked to provide 
something they are perfectly capable of providing, to 
wit, survey designs that build in internal estimates 
of nonsampling as well as sampling variation without 
compromising the substantive goals of the study (see 
Chapter 5 for examples of these techniques). We note 
that such procedures may become quite inexpensive to 
implement in the future as more survey organizations adopt 
computer-assisted telephone interviewing. We also point 
out that the results of such factorial designs can be 
conveniently handled using the modern survey analysis 
procedures we recommend (see Recommendation 14). 

We do not question the need for preliminary and 
auxiliary studies to develop new questions or procedures, 
but we believe that such studies will be most useful when 
they are part of the process of designing serious 
substantive surveys. Even more important, they cannot be 
relied upon by themselves to provide the needed estimates 
of sampling and nonsampling variation. Those estimates 
must come from the main survey itself. If making them is 
a methodological activity, then our new motto must be: 
"Every survey a methodological inquiry." 

Our next recommendation is also directed toward the 
improved understanding of the factors, particularly 
[...] ones, that cause variation in survey results. 

[...] among agencies be seen as a normal 
state of [...] improving the general level of performance 
[...]. We are not, of course, advocating any 
mandated sharing of proprietary information or any 
legislated uniformity in topics studied. But when 
[...] scientists to bring their decentralized analytical 
measurements under statistical control. 

There is room for considerable action by those who 
commission surveys as well. Organizations that contract 
for large-scale surveys of subjective phenomena can both 
build into such surveys methodological investigations (of 
the type noted above) and split the field work contracts 
for surveys between two or more commercial houses. The 
variability in the resultant measurements can provide a 
useful insight into the effects of uncontrolled 
differences between organizations in such things as 
interviewer staffs, training, coding, etc. They can thus 
provide useful information on the range of variation 
(above sampling error) that might be anticipated to be 
produced by fluctuations in organizational factors when 
there is no true change in the population, a necessary 
criterion for inferring change when surveys are 
subsequently replicated. 
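
One hedged way to picture "variation above sampling error" is to compare the observed spread of estimates from several organizations with the spread that sampling alone would produce. The sketch below assumes simple random samples of the same population and the same question; the function name and the figures are hypothetical.

```python
def excess_variance(estimates, sample_sizes):
    """Compare the observed spread of estimates with that expected from sampling.

    estimates: proportion reported by each organization for the same question
    sample_sizes: number of respondents behind each estimate
    Returns (observed variance, expected sampling variance, excess), with the
    excess floored at zero. Assumes simple random samples of one population.
    """
    k = len(estimates)
    if k < 2 or k != len(sample_sizes):
        raise ValueError("need matching lists with at least two organizations")
    # Pooled proportion, weighting each organization by its sample size
    pooled = sum(p * n for p, n in zip(estimates, sample_sizes)) / sum(sample_sizes)
    mean = sum(estimates) / k
    # Sample variance of the organizations' estimates
    observed = sum((p - mean) ** 2 for p in estimates) / (k - 1)
    # Average binomial sampling variance under a common true proportion
    expected = sum(pooled * (1 - pooled) / n for n in sample_sizes) / k
    return observed, expected, max(0.0, observed - expected)

# Hypothetical results from two houses fielding the same question
observed, expected, excess = excess_variance([0.40, 0.48], [1500, 1500])
print(f"observed={observed:.5f} expected={expected:.5f} excess={excess:.5f}")
```

With only two organizations the observed variance is itself very imprecise, so a calculation of this kind indicates the order of magnitude of organizational effects rather than a firm estimate.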

Cooperative ventures of these sorts might well hold an 
important place on the agenda of the center we propose 
(Recommendation 13). 



ADVANCING SURVEY MEASUREMENT AND 
THE SCIENTIFIC USE OF SURVEY DATA 

Our assessment of the present state of the art of survey 
measurement and analysis is that it is ripe for 
improvement. If survey researchers take advantage of the 
growing body of scientific knowledge and recent advances 
in statistical methods in order to improve their 
measurement and analytical techniques, significant 
improvement in the quality of survey research can be 
expected. We also find that the field is on the 
threshold of a far more thoroughly scientific use of 
survey data that can advance theoretical understanding of 
the dynamics of public opinion and public sentiment. 

In the following recommendations we first consider 
what should be done to improve the state of knowledge and 
next suggest steps to advance the scientific use of 
survey data. The recommendations in this section are 
addressed to all those who are now involved in the survey 
enterprise and to some whom we invite to join it. 

RECOMMENDATION 11 Broaden the basic science component of 
survey design and practice. 

A great deal of what is done in a survey organization 
[...] and informal prescriptions for 
[...]. Alternatives are not often 
systematically considered, nor are there explicit 
[...] science and anthropology). 

What can happen when a fundamentally scientific 
approach is taken is illustrated by past achievements in 
survey sampling, which during the 1930s and 1940s began a 
sustained evolution toward a coherent and powerful body 
[...] translated into codified practice (Neyman, 
1934, 1938; Deming, 1950; Yates, 1949; Cochran, 1953; 
Hansen et al., 1953; see Stephan [1948] for a history of 
early use). Other parts of the survey enterprise have, 
to be sure, been subject to careful study. Just a few 
monographs are noted here (Hyman et al., 1954; Bailar, 
1975; Bradburn et al., 1979; Groves and Kahn, 1979; 
Schuman and Presser, 1981), but there is not yet a body 
of knowledge [...] 
rigor to the works on survey sampling. 

We do not believe, moreover, that such a body of 
knowledge will be developed if reliance is placed 
exclusively on the efforts of the political scientists, 
social psychologists, sociologists, and statisticians who 
are providing most of the basic scientific input to 
survey work at the present time. We expect 
anthropologists, cognitive scientists, linguists, and 
[...] in various branches of the humanities, 
among others, to make novel and productive contributions 
[...]. We would like to see the serious and 
formal involvement of some such mix of scholars in the 
[...]. 
[...] provides many unique opportunities for 
scientists. For example, the survey context could 
provide experimental psychologists interested in the 
study of memory and cognition with a rich real-world 
laboratory complete with large random samples of the 
population (rather than the college students 
usually used as subjects in such experiments). 

We have an implicit agenda in calling for this 
program. If successful results were forthcoming from 
such a collaborative research program and were reported, 
among other places, in disciplinary journals, then other 
scientists might well be stimulated to follow suit. If a 
good example were initially set and the field fertile, in 
time many similar endeavors might bloom from this initial 
seed. 

RECOMMENDATION 12 Organize a systematic long-term study 
of all phases of the survey process. This should include 
an extensive interdisciplinary investigation of 
subjective aspects of survey questions. 

Even for most people engaged in survey research, the 
interview is something of a black box: there is the 
printed questionnaire and interviewer's instruction 
manual and the codebook and data tape, but little else by 
way of systematic documentation of how the data were 
actually generated. A deeper understanding is needed of 
how this instrument, the survey interview, works. Survey 
research should have greater intellectual input from 
other social and cognitive scientists. Chapters 8 and 9 
present a preliminary reconnaissance of what some of 
these fields might contribute to our understanding of the 
interviewing process. We find much material, including 
empirical results, methodological innovations, and 
theoretical constructs, that could be exploited to 
improve survey measurement, and in particular to improve 
understanding of the nature of the interaction between a 
respondent and the interviewer and between a respondent 
and the survey questions themselves. Collaborative 
investigations of interviewing that draw upon ideas and 
methods from all relevant scientific disciplines can 
deepen understanding of the structure of the errors 
affecting survey data and can suggest innovations in 
concepts and methods of survey research. 

We offer several possible approaches, without 
prejudging which (if any) will prove fruitful. First, we 
have presented (see Chapter 9) a heuristic classification 
of the ostensible tasks that survey questions pose for 
respondents. The analysis of such a classification, 
informed as well by evidence from psychology, might 
provide a starting point for studies of how, and how 
well, these tasks are executed. 

A second approach considers the interviewer. Carefully 
designed studies of the interviewer's contribution to 
survey error demonstrate that variability due to 
interviewers can be extremely high (see, for example, 
Bailey et al., 1978). However, we have little 
understanding of the dynamics. (In Chapter 8 we offer a 
preliminary discussion of the dynamics of the interaction 
between respondent and interviewer, which might be 
expanded into a more systematic analysis of their mutual 
roles and the effects of role expectations upon the 
quality of survey data. The inquiry might include the 
interviewing of interviewers or the systematic use of 
debriefing.) Our hunch is that interviewers have more 
wisdom about some of the problematic aspects of surveys 
than is usually taken advantage of, although they also 
have the biases of their particular perspectives. Of 
course, such biases are also worthy of investigation. 

A third approach is to develop error profiles and 
models of total survey error (see Chapter 4 and Lessler, 
Volume 2). Finally, we note the extreme importance 
of developing a systematic framework for the 
interpretation of the effects of changes in the wording 
of survey questions. Wording (and other) changes 
sometimes have a substantial effect on response 
distributions and relationships among survey variables, 
and yet in general one cannot accurately predict when or 
explain why such effects occur (see Chapter 5). We 
believe that theoretical development, drawing on ideas 
and methods from other scientific disciplines, is needed 
in order to better understand the sources of meaning and 
frames of reference that respondents employ in answering 
survey questions. 

RECOMMENDATION 13 Establish a center for the study and 
evaluation of survey methodology. 

We believe that methodological research on subjective 
measures will be most fruitful when carried out as part 
of significant substantive investigations. However, 
there will be times when concerted methodological work 
may be called for. Much of the early understanding of 
interviewing comes from the large-scale project reported 
by Hyman et al. (1954); appreciation of question wording 
problems was greatly stimulated by Cantril (1944); and 
concern for interviewer variance in attitude measurement 
was heightened by the early work of Kish (1962). 

An independent center for survey methodology (or a 
consortium of cooperating centers) could provide a locus 
for future interdisciplinary research on survey methods 
(see Recommendation 11) and for long-term intensive 
investigations of the sort suggested in Recommendation 
12. Such a center could also facilitate cooperation 
among survey organizations (see Recommendation 10). In 
addition, a constructive role in improving the review 
process (see Recommendations 5 and 6) could be played by 
an independent center set up for the study and evaluation 
of survey methods. 

One of the most effective rebuttals to flawed or 
biased polls is the reporting of the results of less 
flawed (or less biased) polls. We have noted (see 
Chapter 3) the salutary example of a North American 
Newspaper Alliance poll conducted in rebuttal to a New 
York Times/CBS poll claiming a political realignment in 
the United States. Similarly, we note examples of 
criticism (Schuman and Presser, 1980) in the popular 
media of a poll on support for a constitutional amendment 
to ban abortions; of a critical review of polls on 
nuclear energy (Mitchell, 1980); and of a critique of 
polling on the second Strategic Arms Limitation Treaty 
(Lanouette, 1980). An independent center for the study 
and evaluation of survey methodology might serve as a 
forum for debate on the merits of particular polls and 
surveys, and it could provide institutional resources for 
the commissioning of both independent polls and critical 
reviews such as those above. The dissemination of such 
work might foster more informed public discussion of the 
faults and merits of polls (see Recommendation 2). 

RECOMMENDATION 14 Modernize survey analysis. 

In various parts of this report and in our volume of 
supplementary studies we have furnished illustrations of 
models and statistical methods that offer improvements 
over conventional models and methods in survey work. The 
branch of statistical methods most relevant to the data 
collected in surveys, that dealing with categorical 
data, was markedly underdeveloped up to a decade ago. 
But times have changed. The major textbooks of Bishop et 
al. (1975), Goodman (1978), and Haberman (1978, 1979) 
record a development almost as remarkable in its way as 
the evolution of sampling methods that we noted above. 
The dissemination of these tools to the survey 
practitioner together with a heightened awareness of the 
need for such multivariate strategies would be a 
significant advance. Moreover, these analytic methods 
have the coincidental advantage of encouraging the use of 
factorial designs for the study of wording, context, and 
other effects. By providing a convenient vehicle for 
their analysis, these techniques facilitate a forthright 
confession by survey practitioners of the fundamental 
truth that responses often depend on how a question is 
asked. This fact, which is ordinarily hidden from public 
view, can and ought now to be routinely incorporated in 
survey design. 

In addition to increased use of categorical analysis 
techniques, we suggest graphical analysis and exploratory 
data analysis techniques (Tukey, 1977; Mosteller and 
Tukey, 1977). The use of these techniques may often lend 
additional support to the interpretation of results, but 
they also may point out anomalies in the data. The use 
of these techniques may yield insights not provided by 
other analytic procedures. 

These domains of statistical methods continue to be 
active, with important new work being reported 
regularly. If surveys fail to improve, the fault will 
not lie with the statisticians, for they have given the 
survey craft some very powerful tools indeed. If 
practitioners continue to suppose that they can get along 
with homemade implements, the survey community will have 
itself to blame for falling far short of its potential. 

We offer our next recommendation to all users of 
surveys and poll data. 

RECOMMENDATION 15 Examine periodically the conceptual 
basis for the use of poll and survey data in scientific 
disciplines and applied fields making heavy use of them. 

This recommendation is not as bland as it sounds. The 
panel struggled in various ways with the question of what 
surveys, especially those dealing with subjective 
phenomena, are good for. We find that the conceptual 
foundations for much ongoing survey activity are not 
firm. A great deal in the way of "psychologizing" seems 
to be involved in the interpretation of survey data, but 
the psychological theory involved is almost wholly 
implicit and often incoherent. This is not to say that 
the theory should be faulted for poor application. We 
struggled with the ways in which psychological theory 
might be more effectively brought to bear upon the survey 
process (see Chapter 9), and this indeed was one of the 
motivations for our previous recommendation that a 
long-term collaborative study of the survey process be 
organized. 

The substantive scope of surveys, however, is 
exceedingly broad, ranging over matters dealt with by all 
the social sciences. But for many of these sciences 
surveys are on the periphery and their conceptual problems 
do not command attention. We believe that public opinion 
research has become overly preoccupied with questions that 
may be answered solely with survey data to the neglect of 
theoretically central issues calling for other kinds of 
data or for systematic supplementation of surveys with 
auxiliary information (see Chapter 7). 

One area for consideration is social indicators 
development, as recorded, for example, in Social 
Indicators III (Bureau of the Census, 1980), a 
publication issued too recently for thorough assessment 
by this panel. But we note that the effort to define a 
niche for subjective indicators within the broader 
context of social indicators has attained at least 
temporary success. In Social Indicators III, subjective 
measures are prominently displayed and editorially 
justified. We thus have at least a first approximation 
to what Campbell and Converse (1972:9) asked for: "some 
direct monitoring of key social-psychological states of 
the population." The question now is how to assess this 
accomplishment and to discover what the information is 
good for. 

One must ask whether the tangibility or meaningfulness 
of such subjective measurements can be substantiated. 
Too often it is assumed that anything to which people 
will respond constitutes a measurement. This, however, 
begs the question of whether there exists something to be 
measured. It is surely difficult to demonstrate that 
such measurements fit into some coherent pattern of 
orderly (theoretical or empirical) relationships. It is 
not, however, beyond the reach of our current abilities. 

Validation provides a fundamental challenge for all 
who would take subjective measurements seriously, and it 
is not an inherently unattainable goal. To assist in 
meeting this basic challenge of scientific measurement we 
make two recommendations related to external validation 
of surveys. 

RECOMMENDATION 16 Put surveys in context with 
appropriate auxiliary data. 

Most of our report focuses on matters internal to a 
survey. But one could argue that the context of a survey 
is equally important to its effective use. An 
illuminating discussion by G. Carlsson (1970) takes note 
of the predilection of sociologists (in contrast with 
economists) for studies that "deal with variation between 
units at one point in time, cross-sectional variation, or 
what might be called differential behavior" (in short, 
with the one-time survey). The author goes on to note the 
small likelihood of success in building systematic theory 
or in understanding historical change on the basis of 
this research strategy. In a study of shifts in the 
agenda of public opinion, MacKuen (1981) argues the 
necessity of a dynamic modelling approach and effectively 
exploits time series of survey data and indexes of media 
content in demonstrating the role of the media in 
bringing about such shifts. A general methodological 
statement was offered by Beniger et al. (1978:118), whose 
study of trends in opinion on abortion was intended to 
illustrate the need for social reporting 

to include measures of discrete events, mass 
media coverage and public opinion, analyzed 
both longitudinally and cross-sectionally, in 
the context of other national problems, and 
that monitoring the interrelationships among 
these elements of [social] change, is a 
central task of effective social reporting. 

(Further examples of this work may be found in MacKuen 
and Beniger, Volume 2.) 

We assume a growing stock of survey data (as discussed 
in the next recommendation) and a carefully planned 
program of increments to this stock. But even this is 
not enough, according to the authors just cited. We 
therefore suggest the "outrigger principle" (which their 
work adumbrates) in which there are two supports for the 
main vessel. The main vessel is the program for repeated 
cross-sectional surveys to ascertain changes in response 
levels and differential changes for population subgroups. 
It is supported, on one side, by studies of media output, 
organizational activities, political debate, etc., that 
provide indicators of the ever-changing complex of stimuli 
that may move public opinion and commitment. On the 
other side, the main vessel is supported by special- 
purpose, in-depth surveys of leadership cadres, members 
of relevant groups, communities where effects may be 
greater or sooner, and of panels, which provide 
supplementary data on the process of change, calibrated 
by items shared with the main surveys. 

Further supplements to traditional surveys ought to be 
considered. We note proposals (e.g., Zeisel, 1980) for 
entirely different sorts of surveys that would involve 
exposure of random samples of the population to extensive 
arguments about a topic prior to the elicitation of an 
opinion. We also note the innovative use of laboratory 
studies as a supplement to surveys of topics that are not 
frequent concerns of respondents (e.g., the risk 
consciousness of residents of flood-prone and earthquake- 
prone areas; see Kunreuther et al., 1978). In this work, 
primitive causal theories as to what led people to buy 
insurance against such risks were developed from 
cross-sectional survey data. The theories themselves 
were then tested in a series of laboratory experiments 
that afforded greater opportunity for the control of 
relevant variables in (mock) decision-making tasks. 
These studies were then complemented by econometric 
analyses of actual market data. 

We cannot endorse the particulars of any specific 
approach as a panacea, but the external validation of 
surveys will be greatly facilitated by tying the attitudes 
and attitude changes measured in survey data to their 
causes wherever they may lie. Such auxiliary data 
permit causal analysis of the origins of attitude 
formation and change. One might then be in the position 
of examining methodological questions such as intersurvey 
discrepancies in estimates from the advantageous 
perspective of models that make predictions (tested by 
past data) as to the expected behavior of a given time 
series of measurements. It is possible that some of the 
discrepant survey measurements that now confound us will 
appear regular and lawful in the light of substantive 
theories describing the processes that generate changes 
in the underlying phenomena that a survey is trying to 
measure. 

In making the recommendation that surveys be placed in 
context, we are not proposing an enterprise that can be 
casually undertaken. It is not our intent to further 
encourage offhand observations that, for example, a dip 
in presidential popularity coincides with some particular 
presidential action. (Rigorous testing of the sorts 
discussed above might, however, turn such speculations 
into warranted inferences.) 

Our next recommendation is aimed at facilitating 
implementation of Recommendation 16. 

RECOMMENDATION 17 Maintain, enlarge, improve, and 
exploit archives of survey and auxiliary data. 

Archives of poll and survey data have burgeoned in 
recent years; the Roper Public Opinion Research Center, 
the Inter-University Consortium for Political and Social 
Research, and the Louis Harris Data Center are prominent 
examples. Secondary analysis has become an important 
modus operandi of a large sector of social research and 
has even been studied as a movement and codified as a 
technique (Hyman, 1972). The usefulness of archival 
data, particularly for the analysis of social change, has 
been advanced in recent years by the publication of 
indices of repeated questions (Southwick, n.d.; Center 
for Political Studies, 1976; Martin et al., 1981). 

We favor the encouragement of this kind of work, but 
we also ask for something more, namely, the feedback of 
secondary analysis into current surveys that might be 
deliberately designed to illuminate issues arising in the 
archival data. For example, in June 1969, the Gallup 
poll asked the following question: 

President Nixon has ordered the withdrawal of 
25,000 United States troops from Vietnam in 
the next three months. How do you feel about 
this: do you think troops should be withdrawn 
at a faster rate or slower rate? 

Of the respondents, 42 percent answered "faster," 16 
percent "slower," 13 percent had no opinion, and 29 
percent volunteered the response "same as now." Three 
months later the Harris poll asked respondents: 

In general, do you feel the pace at which the 
president is withdrawing troops is too fast, 
too slow, or about right? 



This poll obtained responses of 28 percent "too 
slow," 49 percent "about right," 6 percent "too fast," 
and 18 percent no opinion. Unfortunately, we do not know 
how much of the difference between the Gallup and Harris 
poll results is due to events occurring in the interval 
between the polls, and how much is due to the differences 
between the two questions (Schuman and Duncan, 1974:233). 
Are the two questions really getting at the same issue? 
Are they actually measures of the same latent attitude, 
or are they dealing with different though related 
dimensions? What is required to answer such questions is 
the design of new measurements so that such issues, which 
routinely arise in the archival record, could be studied 
by experimentally varying the question wording used in a 
future survey. 
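
To make the size of the discrepancy concrete, the percentages quoted above can be laid side by side. The sketch below is illustrative only: the pairing of answer categories across the two questions is an assumption made for the comparison, and whether that pairing is justified is exactly the question a wording experiment would have to settle.

```python
# Percentages as quoted in the text for the two 1969 polls
gallup = {"faster": 42, "same as now": 29, "slower": 16, "no opinion": 13}
harris = {"too slow": 28, "about right": 49, "too fast": 6, "no opinion": 18}

# Assumed pairing of roughly corresponding answers (an editorial assumption).
# Note that "same as now" was volunteered in the Gallup poll, whereas
# "about right" was an offered alternative in the Harris poll.
pairs = [
    ("faster", "too slow"),
    ("same as now", "about right"),
    ("slower", "too fast"),
    ("no opinion", "no opinion"),
]

def compare(gallup, harris, pairs):
    """Return Harris minus Gallup, in percentage points, for each assumed pair."""
    return {(g, h): harris[h] - gallup[g] for g, h in pairs}

differences = compare(gallup, harris, pairs)
for (g, h), d in differences.items():
    print(f"{g!r} vs {h!r}: {d:+d} points")
```

The comparison cannot by itself separate real change over the three months from the effect of offering a middle alternative; only a design that varies the wording within a single survey can do that.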

As a second example, measurements by Louis Harris and 
NORC of Americans' "confidence in national institutions" 
have been found to be discrepant even though the wording 
of the questions and the survey dates were closely 
comparable (see Turner and Krauss, 1978; Smith, 1981; and 
Turner, Volume 2). In this case, the discovery that 
different survey organizations asking the same questions 
obtained different results (indeed, even the estimates of 
trends in "confidence" were not the same) sparked a 
debate that was part of the motivation behind the present 
study. The existence of archival records of such surveys 
makes intensive scientific study possible and allows one 
to learn from the experiences of the past even when the 
significance of such experiences is not recognized at 
first exposure. 

Much broader questions of this same kind can easily be 
proposed. Answers to such questions, narrow and broad, 
would both enhance the capital value of the stock of 
survey data and improve understanding of data now being 
collected. Analysis of archived data would be even more 
informative if survey organizations were to routinely 
coordinate their measurements as we propose in 
Recommendation 10. (We note that the General Social 
Survey has been responsive to investigators' concerns 
about the meaning of its questions, most of which are 
copied from earlier surveys. In the eight years since 
its inception, GSS has conducted approximately one dozen 
split ballot experiments, about half of these at the 
instigation of data users.) We wish to encourage a more 
vigorous interaction between the analysts and producers 
of survey data. The product of this interaction 
would benefit both parties and might result in a general 
improvement in the state of theory and survey practice. 

We hope that more survey organizations and polling 
enterprises will deposit their data in archives. The 
example set in the commercial sector by those 
organizations, such as Harris and Gallup, that routinely 
release most of their surveys to public archives 
deserves to be more widely emulated. Concerns about 
commercial and proprietary issues could in many cases be 
met by establishing an appropriate embargo period prior 
to public release of the data. When survey results have 
been publicized, we think that fair play dictates that 
the data themselves be made publicly available at the 
same time. We take this to be, in part, the spirit of 
the rules on public disclosure recently adopted by the 
National Council on Public Polls. 

We also urge the development and expansion of archives 
of auxiliary data. We are favorably impressed with 
efforts to explain and interpret trends in public opinion 
in the larger context of social and cultural events. We 
have included empirical demonstrations of such analyses 
in Volume 2, and we have in a previous recommendation 
proposed more extensive linkage of opinion and other 
time-series data. To this end, we also propose the 
archiving of data on environmental and media events that 
will provide the frameworks for understanding the 
dynamics of opinion formation and change. The Vanderbilt 
Television News Archive provides one example of such an 
endeavor. The scope of the auxiliary data that will be 
useful, however, extends far beyond that derived from 
television news, and substantial work is required to 
transform such raw data into a form that will be useful 
in the analysis of public opinion. Central tasks in such 
an endeavor will be the careful assessment of the 
coverage and reliability of coding in extant data bases. 
Our experiences with The New York Times data bank, a 
computerized index to archived news stories, and the 
services offered by publicly available clipping services, 
suggest caution in evaluating any claim of comprehensive 
cataloguing of the media's daily product. Severe and 
selective undercoverage is probably typical (see Appendix 
C). However, new technologies, such as the automated 
retrieval of news items from computer formatted wire 
services, may alleviate some of these problems. 

Indexing, however, is a more substantial intellectual 
problem. What is ordinarily desired for analysis are 
counts of the frequency (or relative density) of news 
stories on particular topics. The definition of topics 
and the coding of subject matter are problems that are 
neither new nor totally intractable, but they are 
difficult and crucial. The development of appropriate 
procedures for the organization and indexing of archival 
data will be necessary for their effective use. 

Our final recommendation turns to the broadest topic 
possible, to the question of surveys and society. 

RECOMMENDATION 18 Study the role of polls and surveys in 
modern society. 

Here we broach the notoriously treacherous subject of 
Wissenssoziologie, the study of knowledge as a social 
product and social instrument, how it is produced and how 
it is used. On the far side of such inquiry are the 
epistemological issues that first fascinated the German 
scholars who delineated this domain of research (e.g., 
Mannheim, 1931/1936). But much can be done to produce 
enlightenment about surveys on the nearer side of such 
issues. As a first step, someone should write the 
chapter on polls and surveys that is missing from an 
eminent economist's monograph on The Production and 
Distribution of Knowledge in the United States (Machlup, 
1962). Our preliminary skirmishes with this task 
(recorded in Chapters 2 and 3) indicate some of the 
difficulties and pitfalls. The effort serves to convince 
us of the utility of a "survey of surveys," surely a task 
at which the industry would excel if it chose to 
undertake it. 

Our own exploratory study of the use of polls by the 
media provides some preliminary lessons. First, a 
veritable avalanche of poll reports is published by the 
American and British press. We have been able to do 
little more than cite illustrative cases of abuse that 
follow from lax standards of reporting, but a sustained 
effort could provide reliable knowledge about the use of 
polls both as a method of social communication and as a 
method for the self-education of the society about its 
own views. A more detailed and richly textured 
understanding of the ways in which published polls are 
used (by whom, under what circumstances, and for what 
purposes) would be of value. In the light of the recent 
research on the influence of the media upon the public 
(see Volume 2, MacKuen; Beniger; Marsh), it would be 
interesting to know more about the feedback effects, if 
any, that occur when the public is told what the majority 
thinks about its presidents, the malaise of the national 
spirit, and so forth. Our explorations suggest that this 
will be a rewarding topic for those willing to undertake 
careful and systematic studies. 

The second lesson is that the content analysis 
procedures required in such work must involve a series of 
successive approximations to develop a reliable and 
meaningful coding scheme. Our own attempt has, we hope, 
been described with sufficient detail to suggest problems 
that will be encountered. Future work may be facilitated 
by creative applications of automated retrieval systems 
to computer-readable data banks of news stories (see, for 
example, Erbring, 1980), not summaries or indices, which 
can hide errors of coding and coverage. 

Finally, we offer a speculation, fueled by the 
experience of some of our members who coded and read a 
large sample of newspaper poll stories, that polls and 
surveys can play (or are reported as if they do or should 
play) a role in the decision making of public officials. 
Thus, the potential exists for the abuse of surveys to 
serve the interests of participants in a political 
struggle (see examples in Chapter 3). The scope of this 
problem and the actual (as opposed to potential) impact 
of such abuses should be investigated in an intensive 
study of the ways in which public opinion is reified, 
represented, and taken into account in settling conflicts 
in society. 

The role of surveys and polls in contemporary society, 
and the effect they have had in fixing public attention 
on numerology rather than the process of public debate, is 
suggested by the divergence between the following two 
perspectives: 

The following pages tell the story of a new 
instrument which may help to bridge the gap 
between the people and those who are 
responsible for making decisions in their 
name. The public opinion polls provide a swift 
and efficient method by which legislators, 
educators, experts, and editors, as well as 
ordinary citizens throughout the length and 
breadth of the country, can have a more 
reliable measure of the pulse of democracy 
. . . (Gallup and Rae, 1940:14-15). 

Having been polled as a representative of the 
public, [the citizen] can then read reports 
and see how he looks. As polls become more 
scientific and detailed, broken down into 
occupations, counties, income groups, religious 
denominations, etc., the citizen can discover 
himself (and the opinions which he "ought" to 
have or is likely to have) . . . . Public 
opinion, once the public's expression, becomes 
more and more an image into which the public 
fits its expression (Daniel Boorstin, quoted 
in Gollin, 1980:449). 

We offer to social scientists the challenge of 
exploring the sources of these contrasting views as an 
entry point to a broader study of the impact of polling 
on the processes of public debate, political conflict, 
and public decision making. 



Notes 



1. The General Social Survey (GSS) is a survey conducted 
regularly since 1972 by the National Opinion Research 
Center (NORC) under a grant from the National Science 
Foundation. It was designed to provide a national 
data resource for social scientists, and it includes 
a great variety of subjective measurements, most of 
which repeat measurements made in previous decades. 
It thereby provides the opportunity for the study of 
social changes in the United States. 

2. What Campbell has in mind evidently is what Mead 
(1934:5) stated in discussing the relevance of a 
psychological point of view to the analysis of the 
social act: 

That which belongs (experientially) to the 
individual qua individual, and is accessible 
to him alone, is certainly included within the 
field of psychology, whatever else is or is 
not thus included. This is our best clue in 
attempting to isolate the field of psychology. 
The psychological datum is best defined, 
therefore, in terms of accessibility. That 
which is accessible, in the experience of the 
individual, only to the individual himself, is 
peculiarly psychological. 

Campbell and Converse presumably would not be 
proposing to construct social indicators concerning 
"psychological states" unless the latter reflect or 
respond to social factors and conditions or have some 
kind of structuring based on the social relationships 
in which survey respondents (like all of us) are 
implicated. So there is nothing to be gained by 
arguing the issue of social versus psychological 
phenomena. The subjective phenomena of interest are 
indeed psychological in Mead's sense. Whether they 
are social facts in Durkheim's terms has yet to be 
firmly established, but the demonstration of 
patterning of responses by the social characteristics 
of respondents or by the shifting of response 
distributions synchronously with documented variation 
in social conditions (as in the example from 
Rainwater) goes a long way toward providing the 
demonstration. Of course, working out causal 
relationships is just as great a challenge here as it 
is in, say, meteorology or cancer epidemiology, so 
that the social determination of subjective states is 
never axiomatic nor commonplace. In a careful study 
(see Stinchcombe et al., 1980) the causal labyrinth 
may become intricate indeed. 

3. In addition, at least some psychologists recognize 
the phenomena of repression or denial, and survey 
research has not yet offered much evidence of the 
efficacy of projective questions in bringing latent 
affect to the surface in the survey interview. 

4. According to a memorandum from S.S. Wilks, the vice 
chairman (9 October 1945, archives of the National 
Academy of Sciences), the original members appointed 
in 1945 included: P.G. Agnew (American Statistical 
Association); E. Battey (Compton Advertising); H. 
Cantril (Princeton University); A. Crossley (A. 
Crossley, Inc.); W.E. Deming (Bureau of Budget); R. 
Elder (Lever Brothers); G. Gallup (American Institute 
of Public Opinion); P. Hauser (Department of 
Commerce); C. Hovland (Yale University); P. 
Lazarsfeld (Columbia University); R. Likert 
(Department of Agriculture); D. Lucas (New York 
University); E. Roper (Roper Organization); W. 
Shewhart (Bell Laboratories); F. Stanton (Columbia 
Broadcasting System); S. Stouffer (chairman of the 
committee, University of Chicago and War Department); 
C. Warwick (American Society for Testing of 
Materials); S.S. Wilks (vice chairman of committee; 
Princeton University). At the time the committee was 
discharged (5 August 1954), the roster of members 
showed one additional member, G.F. Hussey (American 
Standards Association). 

A further list included with an annual report of 
1945-6 activities lists Frederick Stephan as a member 
of the committee's staff. In a memorandum of 8 
January 1948 to NAS President Detlev Bronk, Elbridge 
Sibley, executive secretary for the committee, noted 
the appointment of three subcommittees to advise 
directors of the research projects sponsored by the 
committee. Project directors and committee members 
were: (1) study of sampling, S.S. Wilks (chairman); 
F. Stephan (study director); W. Cochran; W.E. Deming; 
R. Franzen; M. Hansen; R. Jessen; P.F. Lazarsfeld; A. 
Politz; J.S. Stock; (2) study of use of panels, D.B. 
Lucas (chairman); P.F. Lazarsfeld (study director); 
S. Barton; W.E. Deming; A.R. Eckler; R.F. Elder; C. 
Hart; C. Hovland; T. Koopmans; F. Stephan; W. Wilbur; 
(3) study of interviewer effect, F. Stephan 
(chairman); C. Hart (study director); A. Crossley; 
W.E. Deming; G. Gallup; P.F. Lazarsfeld; R. Likert; 
E. Roper. 

5. Memorandum from S.S. Wilks, 2 October 1945, p. 2; 
archives of the NAS. 

6. Interest in this topic derives in part from the 
belief that financial support for science, external 
constraints on research, and the recruitment of young 
people into the sciences depend upon the public's 
perception of scientific activity (see National 
Science Board [1975:145] and subsequent discussion in 
the text). 

7. We would note that responses for the "very happy" 
category are commonly reported for these 
measurements. One might intuit that the "not too 
happy" category would provide more reliable 
measurements because the semantic boundary between it 
and the other two categories seems clearer than that 
between "very happy" and "pretty happy." (This 
intuition is supported by the results of an analysis 
of response patterns obtained with this question; see 
Clogg, Volume 2.) However, focusing on this category 
does not eliminate the discrepancies between these 
survey measurements (see Chapter 5 for a comparison 
of responses to the complete set of categories). 

8. Subsequent to the initial critique by Lang, a number 
of flaws in the sample design and execution have been 
noted (Lang, 1981). 

9. When a probability sample of a population is drawn, 
most customary statistics (e.g., means, proportions, 
etc.) have an estimable probability of deviating (by 
any fixed amount) from the value that would have been 
obtained if the statistics had been based upon the 
entire population rather than the sample. This 
variability in statistics based on samples is 
referred to as sampling error. In statistics, 
variability and bias arising from other sources are 
commonly referred to as nonsampling error. It has 
been suggested that the semantic treatment of all 
other types of error as a residual category (i.e., 
nonsampling error) reflects the relatively 
underdeveloped state of thinking on that topic 
(Turner, 1980:43). 

10. Since these changes in elevation would "if correct, 
have surprising and important implications," the 
dispute over these measurements is of great 
significance to geologists and others concerned about 
the future of Southern California (Jackson et al., 
1980). 

11. Examples include data collected by the Center for 
Disease Control that show a large variation in the 
estimates made by different laboratories of the 
amount of lead present in identical samples of 
blood. For a sample of blood with a putative lead 
concentration of 41 µg/dl, 100 cooperating 
laboratories produced measurements that covered a 
range of 33 to 55 µg/dl; this result prompted the 
reviewer to observe (Hunter, 1980:870): "Clearly, 
whatever the true amount of lead in a sample, the 
variability demonstrated [in these measurements] 
guarantees numerous false alarms or, perhaps more 
important, when the true level [of lead in the 
patient's blood] is high, nonalarms." 

12. In addition to the well-known International Standard 
Units of Measurement, which are policed by the NBS, 
there are also more than 5,700 voluntary standards, 
procedures, specifications, and codes for measurements 
of all sorts. These have been approved by the 
government for use in contracting and federal 
regulation (U.S. Office of Management and Budget 
circular 43 FR-48-51; see Hunter, 1980). 

13. As discussed in an appendix to Volume 1, these 
estimates include only stories whose publication we 
can document. Based on a check of the coverage of 
our clipping files, we suspect that these totals 
represent only 10-15 percent of the actual number of 
published stories. 

14. Variables measured on a metric scale have equal 
spacings between categories. For example, the 
distance (interval) between the categories 20 and 21 
for the variable years of age is equivalent to that 
between 50 and 51, etc. Other types of scales 
include ordinal, on which the categories are ordered 
(e.g., from low to high) but the intervals may be 
unequally spaced, for example, rank in the Army; and 
nominal, on which the categories are not ordered, for 
example, species of mammal (cow, whale, mouse). 

15. A study by Stouffer (1955) on civil liberties was 
particularly interesting in its use of parallel 
surveys to offer reassurance about the integrity of 
the work on some rather controversial topics 
(attitudes toward communism and civil liberties in 
the United States in 1954). On the whole, his 
results suggested that the National Opinion Research 
Center and The Gallup Poll obtained similar results 
on the questions of central importance. 

16. The frequency and distribution of such discrepancies 
(relative to the number of comparisons) lend evidence 
that they have not occurred solely by chance, e.g., 
considerably more than 5 discrepancies per 100 fall 
outside the 95 percent confidence intervals for the 
comparison. 
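
The reasoning in note 16 can be illustrated with a short calculation. The sketch below is an illustration only, not the panel's own computation. It assumes that the comparisons are independent and that, if sampling error alone were at work, each would have exactly a 5 percent chance of falling outside its 95 percent confidence interval. The function name prob_at_least and the counts tried (5, 10, 15) are hypothetical choices made for the example.

```python
from math import comb


def prob_at_least(k: int, n: int, rate: float = 0.05) -> float:
    """Return P(X >= k) for X ~ Binomial(n, rate).

    Assumption: the n comparisons are independent and each falls outside
    its 95 percent confidence interval with probability `rate` when only
    sampling error is present.
    """
    return sum(
        comb(n, j) * rate**j * (1.0 - rate) ** (n - j)
        for j in range(k, n + 1)
    )


if __name__ == "__main__":
    n_comparisons = 100
    # About 5 discrepancies per 100 are expected by chance alone.
    print("expected by chance:", n_comparisons * 0.05)
    for observed in (5, 10, 15):
        p = prob_at_least(observed, n_comparisons)
        print(f"P(at least {observed} discrepancies) = {p:.4f}")
```

Under these assumptions, a count well above 5 per 100 has a small probability of arising by chance, which is the sense in which the note speaks of evidence. Real comparisons drawn from the same surveys are unlikely to be fully independent, so the calculation should be read as a rough guide.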

17. These experiments were carried out through the 
generosity of five organizations that worked without 
payment. The Gallup Organization, the Opinion 
Research Corporation, and The Washington Post poll 
added experimental studies to their surveys at the 
panel's request; the General Social Survey program of 
the National Opinion Research Center and the Question 
Form, Wording and Context Project of the Survey 
Research Center at Michigan, as part of ongoing 
programs of methodological research, conducted 
parallel experiments. 

18. The panel's procedure for approving recommendations 
was as follows. All recommendations were submitted 
in writing, and each was voted on separately by 
members of the panel, including the chair and staff. 
The inclusion of staff in the voting reflects the 
fact that the distinction between members and staff 
of the panel was less meaningful than may ordinarily 
be the case; one member of the staff, for example, 
was at the outset a member of the panel. However, no 
recommendation was considered approved if 6 or more 
of the 11 current members of the panel voted in 
opposition. All persons voting in opposition to an 
approved recommendation were afforded the option of 
having their names recorded in dissent, and here, as 
elsewhere in the report, separate statements were 
permitted. Three such statements appear at the end 
of this report. 

The recommendations themselves were voted in the 
form shown in the following pages (i.e., the 
underscored sentences). The accompanying text and 
supporting materials were discussed by the panel and 
directions were given to the editors. Unlike the 
recommendations, the text materials were subject to 
subsequent editorial changes without the requirement 
that individual changes be resubmitted to the panel 
for voting. 

19. The titles of the individual articles convey their 
flavor: "Pollsters do their number on what black 
America thinks" (January 20); "Opinion polling by the 
Russians is oblique art" (February 3); "1980 brings 
more pollsters than ever" (February 16); "Polls, once 
scorned, gain new esteem" (April 5); "Experts find 
polls influence activists" (May 4); "The business of 
the pollsters" (June 29); "Abortion poll: not 
clear-cut; wording of a question makes a big 
difference" (August 18). 

20. Two such articles by D. Greenberg appeared in 1980 in 
The Washington Post: "Polls and other superstitious 
rituals" (September 9) and "The plague of polling" 
(September 16). 

21. Obviously, the evaluation by Paletz et al. would be 
unfair if most Times stories reported Times/CBS News 
surveys. Based on our assessment, this does not 
appear to be the case. 

We used the results of a search of The New York 
Times Information Bank (as current on January 15, 
1980) to compile a list of Times stories published in 
1978 whose abstracts contained the name of one or 
more of the polling and survey organizations listed 
in the Information Bank's index. We thus located 104 
stories that (according to their abstracts) reported 
poll results. (We excluded biographical pieces on 
pollsters, stories on polling as an enterprise, 
etc.) In the abstracts of these 104 stories we found 
112 references to specific polls or surveys (some 
stories referred to more than one poll). Only 30 of 
these 112 references were to New York Times-CBS News 
polls. The remaining 82 references were made to 
surveys by: Gallup (43 references); Harris (13); 
ABC-Harris (4); Roper (3); Yankelovich, Skelly and 
White (3); Opinion Research Corporation (2); 
Conference Board-National Family Opinion (2); 
California Poll (2); Audits & Surveys (2); and, with 
one reference each, Cambridge Survey Research; Mervin 
Field; National Education Association poll of members; 
CBS News poll of delegates; Associated Press-NBC; 
Opinion Research Centre of U.K.; Market and Opinion 
Research International of U.K.; and Los Angeles Times 
poll. Unfortunately, we have no convenient method of 
judging the relative space of these stories, and, as 
we note later, the Information Bank is not an 
entirely reliable index. Nonetheless, it appears 
that the Paletz et al. evaluation does cover a 
significant and noteworthy segment of poll reports in 
The New York Times. 
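
The tally in note 21 can be checked with a few lines of arithmetic. The sketch below simply re-adds the counts given in the note; the variable names are hypothetical, and the percentage it prints is derived from those counts rather than reported by the panel.

```python
# Reference counts as listed in note 21 (abstracts of 1978 Times stories).
multi_reference = {
    "Gallup": 43,
    "Harris": 13,
    "ABC-Harris": 4,
    "Roper": 3,
    "Yankelovich, Skelly and White": 3,
    "Opinion Research Corporation": 2,
    "Conference Board-National Family Opinion": 2,
    "California Poll": 2,
    "Audits & Surveys": 2,
}

# Organizations the note lists with one reference each.
single_reference = [
    "Cambridge Survey Research",
    "Mervin Field",
    "National Education Association poll of members",
    "CBS News poll of delegates",
    "Associated Press-NBC",
    "Opinion Research Centre of U.K.",
    "Market and Opinion Research International of U.K.",
    "Los Angeles Times poll",
]

times_cbs = 30  # references to New York Times-CBS News polls

other_total = sum(multi_reference.values()) + len(single_reference)
grand_total = times_cbs + other_total
times_cbs_share = times_cbs / grand_total

print("references to other organizations:", other_total)
print("all references:", grand_total)
print(f"share that are Times-CBS News polls: {times_cbs_share:.1%}")
```

The listed counts do sum to the 82 and 112 stated in the note, and Times-CBS News polls account for roughly a quarter of the references.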

22. We note and applaud the fact that some news media 
polls, e.g., CBS-N.Y. Times and ABC-Washington Post, 
produce regular summaries of their polls that include 
additional information not included in their news 
stories. Such documentation can serve as a useful 
supplement to details reported in the news stories, 
and some combination of a report of essential details 
(including sponsorship, wording, sampling error, 
etc.) in the story plus further technical information 
in a publicly available memo (for which a charge 
could be made) might provide the most flexible means 
of ensuring adequate reporting. 

23. In Chapter 3 we review briefly five documents 
proposing or promulgating standards for polls and 
surveys: Dodd's (1947) proposed code to govern 
certification of opinion survey organizations; the 
Code of Professional Ethics and Practices of the 
American Association for Public Opinion Research 
(1979-80); the Principles of Disclosure of the 
National Council on Public Polls (1979); the Code of 
Standards for Survey Research of the Council of 
American Survey Research Organizations; and Circular 
A-46, "Standards and Guidelines for Federal 
Statistics," used by the Office of Management and 
Budget in reviewing proposed federal surveys (see 
Executive Office of the President, 1976). These 
documents differ in many ways, in particular in 
regard to their relative emphasis on technical versus 
ethical issues and on subjective versus objective 
variables. 

24. See, for example, the analysis and reinterpretation of 
presidential popularity measures by Sussman (1977) in 
The Washington Post, which is discussed in MacKuen 
and Turner (Volume 2), and articles by E. J. Dionne, 
Jr., and others, discussed in this chapter. 

25. For example, in most issues of the Journal of the 
American Statistical Association, the book review 
columns do not have any coverage of actual survey 
reports, even though the "publications received" 
section lists numerous such reports. (Of course, 
books that include survey results are reviewed.) 

26. The idea that a national sample of 1,500 respondents 
is adequate for a survey may trace back to someone's 
calculation from the binomial distribution that with 
p = .5, two standard errors of the estimated 
proportion are approximately .026, so that the 
95-percent confidence interval implies an "error" of 
between 2 and 3 percentage points. (When p is larger 
or smaller than .5, this "error" is less.) 
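
The calculation described in note 26 can be reproduced as follows. This is a minimal sketch that assumes simple random sampling, as the binomial argument in the note does; national samples are usually clustered, so the sampling error of an actual survey would generally be somewhat larger. The function name two_standard_errors and the alternative values of p are hypothetical choices for the example.

```python
import math


def two_standard_errors(p: float, n: int) -> float:
    """Return twice the standard error of a sample proportion.

    Assumption: a simple random sample of size n, so that the standard
    error is sqrt(p * (1 - p) / n).
    """
    return 2.0 * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 1500
    for p in (0.5, 0.3, 0.1):
        print(f"p = {p:.1f}, n = {n}: two standard errors = "
              f"{two_standard_errors(p, n):.4f}")
```

With p = .5 and n = 1,500 the function returns a value that rounds to .026, matching the note, and the value shrinks as p moves away from .5.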

27. This principle is well known in experimental studies, 
for which the idea of factorial design is now 
commonplace, startling as it was when Fisher 
introduced it more than a half-century ago. 

Note: Experimental designs can provide estimates 
of (specific) components of nonsampling variation if 
these components are identified beforehand and the 
relevant factors (e.g., question wording, interviewer 
training, question context, etc.) are experimentally 
varied in the design of the survey. 

28. Estimating the parameters of such models would 
require the use of factorial designs to assess the 
magnitude of the errors introduced at each step in 
the production of survey data. 

29. To be sure, the Lazarsfeld school of survey analysis 
had developed a set of procedures corresponding to a 
well-articulated "language of social research," which 
featured cross-classification as a device for testing 
causal hypotheses (see, for example, Rosenberg, 
1968). But the same group of research workers rather 
deliberately eschewed formal procedures of 
statistical modeling and inference. 



Separate Statements 



STATEMENT, OTIS DUDLEY DUNCAN 

An important topic is not covered in the foregoing 
recommendations because the panel was unable to formulate 
a position commanding majority assent. I state my 
personal view in the interest of stimulating consideration 
of this topic by readers of our report. 

The topic is the role of federal agencies. Collection 
of data on subjective variables by federal statistical 
agencies should, in my opinion, be limited. This report 
acknowledges, and indeed emphasizes, that there are 
subjective aspects to many questions that ostensibly deal 
with matters of fact. So there is no possibility of 
producing social statistics without getting into 
subjective phenomena to some extent. But this concession 
should not be allowed to become the foot in the door 
whereby government statistics are expanded to cover the 
whole range of subjective variables. There are two 
considerations, one technical, the other political, the 
former being much less important than the latter. 

On the technical side, I write with the experience of 
some three decades of working with government 
statisticians in various ways and of attempting to do 
research in both the factfinding and subjective modes. 
My observation is that the skills and sensitivities 
needed for good factfinding research are, for the most 
part, complementary to rather than coincident with those 
required for first-rate research focused on subjective 
variables. Storer (1969:6) may have recognized this 
state of affairs in proposing "the establishment of 
another separate survey research agency in the Federal 
government." No doubt such an agency could recruit 
survey experts knowledgeable about subjective phenomena 
and thereby overcome any present technical limitations on 
the expansion of government statistics into the subjective 
area. Hence there is no need to dwell upon or 
overemphasize any such limitations. 

The main concern is political. There are large parts 
of people's lives for which they should not be accountable 
in any way to the government, including what they think, 
feel, and believe. A panel that preceded us found that: 
"A substantial proportion of respondents expressed 
negative feelings about the value and accuracy of 
surveys, how interesting they are to respondents, the 
confidentiality of survey records, and the integrity of 
survey takers" (National Research Council, 1979:39). One 
can argue that a "substantial proportion" of the public 
is mistaken about these matters. Nonetheless, people 
have the right not to be badgered by their government 
into participating in an activity they do not believe 
in. The realm of the subjective is the last castle one 
may defend against the coercion of society and government 
and the vicissitudes of life and death. It is too 
precious to become the plaything of bureaucrats, however 
worthy their intentions to make a better life for us 
all. The U.S. Bureau of the Census has no business 
asking me or you "Are you very happy, pretty happy, or 
not too happy?" 

To be sure, apart from the legally mandatory decennial 
census, government surveys do not compel respondent 
cooperation. But people do not necessarily know this, and 
even if so informed they well may wonder if 
nonparticipation can have adverse consequences. It is known 
that the Current Population Survey of the Bureau of the 
Census enjoys a much lower refusal rate than nonfederal 
surveys. In parallel surveys by the Census Bureau and 
Survey Research Center (University of Michigan) the 
(unweighted) refusal rates were 5.8 percent and 13.0 
percent, respectively (National Research Council, 1979:42, 
Table 1). It may be that at least 7 percent of the 
American people believe that participation in a federal 
survey is required by law or is otherwise an obligation 
of the citizen. That is too many people undergoing the 
admittedly "subjective" experience of being coerced into 
revealing what is on their minds. 

In another publication (Duncan, 1972) I have argued 
persuasively that there are indeed things a government 
ought to know about how people feel, but that these are 
among the very things that government agents should be 
forbidden to ask. Whether people trust the government is 
one. Whether the public thinks there should be religious 
observance in the schools may be another. But, as I 
noted in 1972, there are reputable survey and polling 
organizations that can collect relevant statistics. 
Statistics pertinent to wise and benign governance need 
not be produced by government statisticians. 

My recommendation is not controversial. It is, in 
fact, no more than a restatement of the policy of the 
Bureau of the Census, at any rate the one that prevailed 
a few years ago when, according to a high official 
(Levine, 1968), it "had a strong predilection for factual 
as opposed to attitudinal or motivational questions." Mr. 
Levine did go on to note that "it has been necessary to 
consider expanding the scope of the Bureau's work in 
recent years." I urge that the Bureau and sister federal 
statistical agencies continue to be strongly skeptical of 
all allegations of "necessity." 



STATEMENT, BARUCH FISCHHOFF 

One issue on which our panel declined to take a position 
was the appropriate level of federal government 
involvement in conducting surveys of subjective 
phenomena. I would like to note some arguments that 
motivated this decision. 

Adopting a policy on any issue obviously requires a 
consideration of the costs and benefits associated with 
that policy and the consequences associated with the 
alternatives to that policy. From either perspective, a 
blanket recommendation regarding government involvement 
seems ill-advised for two reasons: 

First, one legitimate goal of government is to meet 
the needs of the populace as felt by the populace. Two 
potential benefits of surveys in this regard are: helping 
civil servants to understand those needs and providing 
them with feedback on the adequacy of their performance. 
The extent of such benefits arising from a particular 
survey depends upon how well it is done, how sensitively 
it is used, and how great the need for it was. It may be 
particularly useful when civil servants' intuitive 
"reading" of the public will is erroneous. In any case, 
the value of surveys must be assessed on a case-by-case 
basis, rather than judged categorically. Assessing the 
costs of a survey also requires an individualized 
analysis. For example, one wants to know: 

Will the respondents' participation be solicited in 
as non-coercive a way as possible (so that any 
increase in response rate could be interpreted as 
reflecting a greater desire to participate in federal, 
rather than private, surveys)? 

Will its results be broadly disseminated rather than 
being put to private (and possibly manipulative) use? 

Are its questions designed to direct policy (or just 
to satisfy curiosity)? 

Clearly, an assessment of the costs and benefits of a 
proposed survey should serve not only an evaluative 
function, but also suggest ways to make it more 
attractive. 

Second, if the government were to stop conducting 
surveys of subjective phenomena, one must ask what will 
come in their stead. Perhaps the most benign possibility 
is increased reliance on political punditry, with those 
individuals best able to gain a hearing for their views 
expounding on what the public thinks and wants. A more 
troublesome possibility is that surveys will continue to 
be conducted, but that they will be primarily proprietary 
in nature. That is, they will be conducted at the behest 
of those individuals or organizations with the resources 
to commission them. The results of such surveys may then 
be used to strategic advantage, perhaps being kept 
secret, perhaps being released selectively. An individual 
who questioned the intentions of these institutions might 
argue that if "surveys are outlawed, only outlaws will 
have surveys." Government, on the other hand, is required 
to reveal the results of the studies it conducts or 
commissions. Moreover, it is entrusted with acting in 
the public's best interest and subject to the censure of 
that public should it fail to do so. 

None of these comments, of course, would excuse the 
conduct of knowingly flawed or biased surveys, or the 
misrepresentation of their results for manipulative 
purposes. However, even such excesses reflect more 
poorly on those responsible for them than on the medium 
they use. 



DISSENT TO RECOMMENDATION 6, HOWARD SCHUMAN* 

Independent peer review is desirable for all scientific 
research projects that have important goals or raise 
controversial issues. However, there is no compelling 
reason to single out all surveys, especially without 
qualification concerning their goals. The further 
stipulation regarding "paid reviews" raises troublesome 
problems regarding the independence of such reviews. In 
general, this recommendation needs to be carefully 
thought through before being endorsed. 



*Lester Frankel and Tom Smith join in this dissent. 


